An Engine for Research
Books, of course, have always been about technology. The book is a technological device, and innovations in book publishing technology have been many over the years. The advent of the Internet and the dramatic rise of the World Wide Web, however, have led us into the largely uncharted waters of on-line digital publishing. We are only beginning to feel our way into the possibilities of “publishing” on the Web; the future appears promising, but the path, or paths, are difficult to follow. Today I will give you a progress report on a work in progress—the Handbook of Texas Online.
The Handbook of Texas
On February 13, 1897, ten individuals met on the University of Texas campus in Austin to discuss their shared interest in Texas history, a discussion that would lead to the founding of the Texas State Historical Association in March of that year. The association’s charter members identified several key objectives for the new organization, including a comprehensive history of the state. Forty-three years later, University of Texas historian and TSHA director Walter Prescott Webb determined to make good on that objective by launching a program to develop a comprehensive encyclopedic history of Texas. He called it the Handbook of Texas.
It was published in 1952—two volumes totaling almost 2,000 pages and containing more than 18,000 entries on a vast range of topics related to Texas history, culture, and geography. The new encyclopedia quickly won acclaim as a landmark publication in state and local history, receiving international recognition as a model of excellence for regional history. A supplemental volume published in 1976 added another 1,000 pages and began a long-term process of adding entries to cover new developments and expanding the overall content to include recent trends in historical scholarship.
In 1982 TSHA launched a fourteen-year project to completely revise, update, and expand the Handbook of Texas. To develop an encyclopedia that would serve into the next millennium, the association built an impressive coalition that included:
• 28 institutions of higher learning serving as co-sponsors
• 64 noted scholars serving as advisory editors for major subject areas
• 600 readers with specialized knowledge reviewing articles for general accuracy
• 3,000 individuals contributing one or more articles
More than 60 charitable foundations and hundreds of individuals contributed financially toward the multimillion-dollar cost of this ambitious endeavor. Their collective efforts resulted in the New Handbook of Texas, published in 1996. Now filling six volumes and encompassing almost 7,000 pages, the New Handbook features more than 24,000 articles and 700 illustrations.
The Handbook of Texas Online
Publication of the revised and expanded Handbook represented a tremendous accomplishment that in many ways exceeded TSHA’s expectations and certainly marked a high point in the Handbook’s publication history. The New Handbook continues to serve as a magnificent resource today, but its tenure as the high point would last less than three years. The seeds of its successor were sown in early 1997 when Harold Billings, director of the University of Texas General Libraries, asked Ron Tyler when TSHA would put the Handbook on the Internet.
Frankly, the question took us by surprise. TSHA had planned since the late 1980s to develop electronic dimensions to the Handbook, but those ideas were very general in nature and not very well informed by technological expertise. Initial efforts in the area of information technology had been aimed at facilitating project management and streamlining editorial processes. As a result, when the New Handbook was published, all of the entries existed in electronic form as word-processing files. By 1992, the project staff was thinking of possible electronic products, and as the New Handbook neared completion, it was generally assumed that a CD-ROM version probably represented the first step into electronic publishing. In 1997, we were aware of the Internet and the emerging phenomenon known as the World Wide Web, but it was unfamiliar territory whose potential as a publishing medium was uncertain, to say the least.
We were fortunate to find a partner in the Digital Library Services Division of UT General Libraries, a pioneer of Internet publishing. Billings and Mark McFarland, director of DLSD, urged TSHA to consider going straight to the Internet with an on-line version of the Handbook that would be available to anyone with access to a Web browser. They made a convincing case for the Web’s emerging power as a communications medium, and in the spring of 1997 we launched a collaborative venture to develop an on-line version of the Handbook of Texas. Our initial objectives included:
• Publishing digital content in a fully searchable on-line environment accessible via a Web browser;
• Pursuing a digital conversion project aimed at maximizing access to content; and
• Developing a Web-based publication that would complement rather than compete with the print edition.
During the spring and summer of 1997, our partners at UT General Libraries converted the Handbook’s 24,000 text files to the HTML format used by Web browsers and designed the initial user interface. Developing the full-text database environment and programming the search engine occupied the project team from the fall of 1997 through the summer of 1998. The initial version was released to the UT Austin campus in the fall of 1998 as a test site and was formally released to the public on February 15, 1999. At its public debut, held at the Center for American History on the UT Austin campus, the Handbook of Texas Online provided, free of charge, the following:
• Full text of all articles published in the 1996 print edition, plus 400 articles not included due to space constraints
• Browse features that allowed readers to look through alphabetical lists of article titles, with links to the individual articles
• A search engine that provided full-text and Boolean search capabilities
It was a wonderful party! Leaders from the University of Texas and from TSHA jointly celebrated a fabulous product resulting from a tremendous partnership. And yet one question lingered in our minds. Would readers use it? We had built it, but would they come?
We measure user response in two basic ways: statistics gathered from Handbook servers that count the number of pages (page views) requested by users, and feedback received from our readers, generally via e-mail. The following is a statistical summary of Handbook of Texas Online traffic to date:
200,000 page views
1,000,000 page views
2,000,000 page views
2,300,000 page views
Currently, we are averaging about 1.5 million page views per month.
E-mail from users provides a second important measure of activity, and the Handbook site offers three feedback mechanisms: two on-line forms for suggesting corrections and new entries and a general-purpose e-mail message. To date, we have received more than 12,000 e-mails and currently average around 20 new e-mails per day. The messages run the gamut from praise for the Handbook to complaints about errors or omissions to requests for help with all manner of needs related to Texas (including frequent requests for homework assistance).
The Digital Difference: What We’ve Learned to Date
In a little over two months, we will celebrate the on-line Handbook’s fifth anniversary. In that time, our servers have successfully responded to more than 60 million requests for pages and have transferred 2.6 terabytes of information to users. Much of what we have learned over the past five years can be summarized with the phrase “Digital is different.”
To begin with, our readers are different—they are not the users we expected to encounter. Our initial, somewhat ill-informed assumption had been that the site would be used primarily by readers who were familiar with the print edition and interested in taking advantage of the site’s electronic search capabilities. We also assumed that, for the most part, our users would be Texans or displaced Texans with only incidental use outside the state. All of these assumptions proved incorrect. Anecdotal information collected from user e-mails indicates that many, if not most, of the site’s users were unfamiliar with the Handbook prior to visiting it on-line. This impression is reinforced by server statistics, which indicate that more than 60 percent of Handbook traffic is referred to the site from external search engines. And far from being primarily located in Texas, our on-line users come from every state in the United States and from sixty-four countries around the world. While the majority of the page views originate with Texas-based users, the on-line Handbook has developed an international audience. It also quickly became clear from the nature of submitted queries that many search engine users did not intend to arrive at a Texas history encyclopedia. In fact, a good number had difficulty realizing they had arrived at an on-line encyclopedia once they reached our pages. For many readers, the Handbook of Texas Online proved a pleasant discovery, but that discovery process has presented the editors with new challenges in effectively presenting the Handbook’s content to an audience that has less familiarity with the basic source than users of previous editions.
A second lesson about which we feel confident is that Internet usage levels are huge, sustained, and growing. Our early concerns about attracting users have been swept away by the challenges of serving and responding to the avalanche of traffic generated by the on-line Handbook. Activity reports from the Handbook’s servers indicate significant activity levels twenty-four hours a day, seven days a week, with seasonal peaks that roughly align with academic calendars. Given the rapidly expanding availability of high-speed broadband Internet access and the early stage of Internet-specific content for the on-line Handbook, it seems likely that peak monthly usage levels will approach or exceed 3 million page views in the next few years.
What can we say about why users are coming to the on-line Handbook? At least two things, based on the nature of the current site and e-mail comments from readers: content is key, and text still matters, even in an age of digital multimedia. When we compare our traffic levels to those of other sites related to state and local history, the most immediate contrast is in the sheer volume of content: 24,000 articles spanning a vast range of subject matter. It is very easy for many different kinds of readers to find something useful in this massive accumulation of material. The usage rates also speak to the core value of concise, authoritative content. Compared with most electronic encyclopedias, the Handbook of Texas Online contains a paucity of multimedia: fewer than 100 images and only a handful of pilot audio and video clips. It is a shortcoming that we are working hard to address, but in the meantime users flock to the site and laud its tremendous value as a research tool.
A final observation that we believe is germane to all publishers planning to operate on the Internet is that, compared with print readers, Web users talk back! The frequency, volume, and nature of e-mail communications directed to the on-line Handbook far exceed any comparable level of feedback from users of the print editions, for several reasons. First, the ease of e-mail communication generally encourages users to comment far more than they did even in the recent past. Second, the relative newness and immediacy of the World Wide Web encourage readers to expect on-line material to be more up to date and more continuously revised than print works in general. Finally, and perhaps most important, the unexpected arrival of many readers at the on-line Handbook site via search engines means that they encounter the material more often as a discovery experience than our print readers do. A six-volume, 7,000-page encyclopedia is something that most readers pull from a library shelf with a familiar sense of what they will find within it. Not so for many of our Internet visitors. And as their expectations diverge from what the Handbook delivers, the Internet provides a dynamic medium to make the encounter a two-way learning experience, both for the user and the publisher. It is this potential for discovery that may prove to be one of the Internet’s greatest contributions to learning and communication.