You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@orc.apache.org by om...@apache.org on 2017/07/24 17:49:33 UTC
[31/51] [partial] orc git commit: ORC-204 Update and use CMake
External Project to build C++ compression libraries.
http://git-wip-us.apache.org/repos/asf/orc/blob/590245a0/c++/libs/snappy-1.1.2/testdata/lcet10.txt
----------------------------------------------------------------------
diff --git a/c++/libs/snappy-1.1.2/testdata/lcet10.txt b/c++/libs/snappy-1.1.2/testdata/lcet10.txt
deleted file mode 100644
index 25dda6b..0000000
--- a/c++/libs/snappy-1.1.2/testdata/lcet10.txt
+++ /dev/null
@@ -1,7519 +0,0 @@
-
-
-The Project Gutenberg Etext of LOC WORKSHOP ON ELECTRONIC TEXTS
-
-
-
-
- WORKSHOP ON ELECTRONIC TEXTS
-
- PROCEEDINGS
-
-
-
- Edited by James Daly
-
-
-
-
-
-
-
- 9-10 June 1992
-
-
- Library of Congress
- Washington, D.C.
-
-
-
- Supported by a Grant from the David and Lucile Packard Foundation
-
-
- *** *** *** ****** *** *** ***
-
-
- TABLE OF CONTENTS
-
-
-Acknowledgements
-
-Introduction
-
-Proceedings
- Welcome
- Prosser Gifford and Carl Fleischhauer
-
- Session I. Content in a New Form: Who Will Use It and What Will They Do?
- James Daly (Moderator)
- Avra Michelson, Overview
- Susan H. Veccia, User Evaluation
- Joanne Freeman, Beyond the Scholar
- Discussion
-
- Session II. Show and Tell
- Jacqueline Hess (Moderator)
- Elli Mylonas, Perseus Project
- Discussion
- Eric M. Calaluca, Patrologia Latina Database
- Carl Fleischhauer and Ricky Erway, American Memory
- Discussion
- Dorothy Twohig, The Papers of George Washington
- Discussion
- Maria L. Lebron, The Online Journal of Current Clinical Trials
- Discussion
- Lynne K. Personius, Cornell mathematics books
- Discussion
-
- Session III. Distribution, Networks, and Networking:
- Options for Dissemination
- Robert G. Zich (Moderator)
- Clifford A. Lynch
- Discussion
- Howard Besser
- Discussion
- Ronald L. Larsen
- Edwin B. Brownrigg
- Discussion
-
- Session IV. Image Capture, Text Capture, Overview of Text and
- Image Storage Formats
- William L. Hooton (Moderator)
- A) Principal Methods for Image Capture of Text:
- direct scanning, use of microform
- Anne R. Kenney
- Pamela Q.J. Andre
- Judith A. Zidar
- Donald J. Waters
- Discussion
- B) Special Problems: bound volumes, conservation,
- reproducing printed halftones
- George Thoma
- Carl Fleischhauer
- Discussion
- C) Image Standards and Implications for Preservation
- Jean Baronas
- Patricia Battin
- Discussion
- D) Text Conversion: OCR vs. rekeying, standards of accuracy
- and use of imperfect texts, service bureaus
- Michael Lesk
- Ricky Erway
- Judith A. Zidar
- Discussion
-
- Session V. Approaches to Preparing Electronic Texts
- Susan Hockey (Moderator)
- Stuart Weibel
- Discussion
- C.M. Sperberg-McQueen
- Discussion
- Eric M. Calaluca
- Discussion
-
- Session VI. Copyright Issues
- Marybeth Peters
-
- Session VII. Conclusion
- Prosser Gifford (Moderator)
- General discussion
-
-Appendix I: Program
-
-Appendix II: Abstracts
-
-Appendix III: Directory of Participants
-
-
- *** *** *** ****** *** *** ***
-
-
- Acknowledgements
-
-I would like to thank Carl Fleischhauer and Prosser Gifford for the
-opportunity to learn about areas of human activity unknown to me a scant
-ten months ago, and the David and Lucile Packard Foundation for
-supporting that opportunity. The help given by others is acknowledged on
-a separate page.
-
- 19 October 1992
-
-
- *** *** *** ****** *** *** ***
-
-
- INTRODUCTION
-
-The Workshop on Electronic Texts (1) drew together representatives of
-various projects and interest groups to compare ideas, beliefs,
-experiences, and, in particular, methods of placing and presenting
-historical textual materials in computerized form. Most attendees gained
-much in insight and outlook from the event. But the assembly did not
-form a new nation, or, to put it another way, the diversity of projects
-and interests was too great to draw the representatives into a cohesive,
-action-oriented body.(2)
-
-Everyone attending the Workshop shared an interest in preserving and
-providing access to historical texts. But within this broad field the
-attendees represented a variety of formal, informal, figurative, and
-literal groups, with many individuals belonging to more than one. These
-groups may be defined roughly according to the following topics or
-activities:
-
-* Imaging
-* Searchable coded texts
-* National and international computer networks
-* CD-ROM production and dissemination
-* Methods and technology for converting older paper materials into
-electronic form
-* Study of the use of digital materials by scholars and others
-
-This summary is arranged thematically and does not follow the actual
-sequence of presentations.
-
-NOTES:
- (1) In this document, the phrase electronic text is used to mean
- any computerized reproduction or version of a document, book,
- article, or manuscript (including images), and not merely a machine-
- readable or machine-searchable text.
-
- (2) The Workshop was held at the Library of Congress on 9-10 June
- 1992, with funding from the David and Lucile Packard Foundation.
- The document that follows represents a summary of the presentations
- made at the Workshop and was compiled by James DALY. This
- introduction was written by DALY and Carl FLEISCHHAUER.
-
-
-PRESERVATION AND IMAGING
-
-Preservation, as that term is used by archivists,(3) was most explicitly
-discussed in the context of imaging. Anne KENNEY and Lynne PERSONIUS
-explained how the concept of a faithful copy and the user-friendliness of
-the traditional book have guided their project at Cornell University.(4)
-Although interested in computerized dissemination, participants in the
-Cornell project are creating digital image sets of older books in the
-public domain as a source for a fresh paper facsimile or, in a future
-phase, microfilm. The books returned to the library shelves are
-high-quality and useful replacements on acid-free paper that should last
-a long time. To date, the Cornell project has placed little or no
-emphasis on creating searchable texts; one would not be surprised to find
-that the project participants view such texts as new editions, and thus
-not as faithful reproductions.
-
-In her talk on preservation, Patricia BATTIN struck an ecumenical and
-flexible note as she endorsed the creation and dissemination of a variety
-of types of digital copies. Do not be too narrow in defining what counts
-as a preservation element, BATTIN counseled; for the present, at least,
-digital copies made with preservation in mind cannot be as narrowly
-standardized as, say, microfilm copies with the same objective. Setting
-standards precipitously can inhibit creativity, but delay can result in
-chaos, she advised.
-
-In part, BATTIN's position reflected the unsettled nature of image-format
-standards, and attendees could hear echoes of this unsettledness in the
-comments of various speakers. For example, Jean BARONAS reviewed the
-status of several formal standards moving through committees of experts;
-and Clifford LYNCH encouraged the use of a new guideline for transmitting
-document images on Internet. Testimony from participants in the National
-Agricultural Library's (NAL) Text Digitization Program and LC's American
-Memory project highlighted some of the challenges to the actual creation
-or interchange of images, including difficulties in converting
-preservation microfilm to digital form. Donald WATERS reported on the
-progress of a master plan for a project at Yale University to convert
-books on microfilm to digital image sets, Project Open Book (POB).
-
-The Workshop offered rather less of an imaging practicum than planned,
-but "how-to" hints emerge at various points, for example, throughout
-KENNEY's presentation and in the discussion of arcana such as
-thresholding and dithering offered by George THOMA and FLEISCHHAUER.
-
-NOTES:
- (3) Although there is a sense in which any reproductions of
- historical materials preserve the human record, specialists in the
- field have developed particular guidelines for the creation of
- acceptable preservation copies.
-
- (4) Titles and affiliations of presenters are given at the
- beginning of their respective talks and in the Directory of
- Participants (Appendix III).
-
-
-THE MACHINE-READABLE TEXT: MARKUP AND USE
-
-The sections of the Workshop that dealt with machine-readable text tended
-to be more concerned with access and use than with preservation, at least
-in the narrow technical sense. Michael SPERBERG-McQUEEN made a forceful
-presentation on the Text Encoding Initiative's (TEI) implementation of
-the Standard Generalized Markup Language (SGML). His ideas were echoed
-by Susan HOCKEY, Elli MYLONAS, and Stuart WEIBEL. While the
-presentations made by the TEI advocates contained no practicum, their
-discussion focused on the value of the finished product, what the
-European Community calls reusability, but what may also be termed
-durability. They argued that marking up--that is, coding--a text in a
-well-conceived way will permit it to be moved from one computer
-environment to another, as well as to be used by various users. Two
-kinds of markup were distinguished: 1) procedural markup, which
-describes the features of a text (e.g., dots on a page), and 2)
-descriptive markup, which describes the structure or elements of a
-document (e.g., chapters, paragraphs, and front matter).
-
-The TEI proponents emphasized the importance of texts to scholarship.
-They explained how heavily coded (and thus analyzed and annotated) texts
-can underlie research, play a role in scholarly communication, and
-facilitate classroom teaching. SPERBERG-McQUEEN reminded listeners that
-a written or printed item (e.g., a particular edition of a book) is
-merely a representation of the abstraction we call a text. To concern
-ourselves with faithfully reproducing a printed instance of the text,
-SPERBERG-McQUEEN argued, is to concern ourselves with the representation
-of a representation ("images as simulacra for the text"). The TEI proponents'
-interest in images tends to focus on corollary materials for use in teaching,
-for example, photographs of the Acropolis to accompany a Greek text.
-
-By the end of the Workshop, SPERBERG-McQUEEN confessed to having been
-converted to a limited extent to the view that electronic images
-constitute a promising alternative to microfilming; indeed, an
-alternative probably superior to microfilming. But he was not convinced
-that electronic images constitute a serious attempt to represent text in
-electronic form. HOCKEY and MYLONAS also conceded that their experience
-at the Pierce Symposium the previous week at Georgetown University and
-the present conference at the Library of Congress had compelled them to
-reevaluate their perspective on the usefulness of text as images.
-Attendees could see that the text and image advocates were in
-constructive tension, so to say.
-
-Three nonTEI presentations described approaches to preparing
-machine-readable text that are less rigorous and thus less expensive. In
-the case of the Papers of George Washington, Dorothy TWOHIG explained
-that the digital version will provide a not-quite-perfect rendering of
-the transcribed text--some 135,000 documents, available for research
-during the decades while the perfect or print version is completed.
-Members of the American Memory team and the staff of NAL's Text
-Digitization Program (see below) also outlined a middle ground concerning
-searchable texts. In the case of American Memory, contractors produce
-texts with about 99-percent accuracy that serve as "browse" or
-"reference" versions of written or printed originals. End users who need
-faithful copies or perfect renditions must refer to accompanying sets of
-digital facsimile images or consult copies of the originals in a nearby
-library or archive. American Memory staff argued that the high cost of
-producing 100-percent accurate copies would prevent LC from offering
-access to large parts of its collections.
-
-
-THE MACHINE-READABLE TEXT: METHODS OF CONVERSION
-
-Although the Workshop did not include a systematic examination of the
-methods for converting texts from paper (or from facsimile images) into
-machine-readable form, nevertheless, various speakers touched upon this
-matter. For example, WEIBEL reported that OCLC has experimented with a
-merging of multiple optical character recognition systems that will
-reduce errors from an unacceptable rate of 5 characters out of every
-l,000 to an unacceptable rate of 2 characters out of every l,000.
-
-Pamela ANDRE presented an overview of NAL's Text Digitization Program and
-Judith ZIDAR discussed the technical details. ZIDAR explained how NAL
-purchased hardware and software capable of performing optical character
-recognition (OCR) and text conversion and used its own staff to convert
-texts. The process, ZIDAR said, required extensive editing and project
-staff found themselves considering alternatives, including rekeying
-and/or creating abstracts or summaries of texts. NAL reckoned costs at
-$7 per page. By way of contrast, Ricky ERWAY explained that American
-Memory had decided from the start to contract out conversion to external
-service bureaus. The criteria used to select these contractors were cost
-and quality of results, as opposed to methods of conversion. ERWAY noted
-that historical documents or books often do not lend themselves to OCR.
-Bound materials represent a special problem. In her experience, quality
-control--inspecting incoming materials, counting errors in samples--posed
-the most time-consuming aspect of contracting out conversion. ERWAY
-reckoned American Memory's costs at $4 per page, but cautioned that fewer
-cost-elements had been included than in NAL's figure.
-
-
-OPTIONS FOR DISSEMINATION
-
-The topic of dissemination proper emerged at various points during the
-Workshop. At the session devoted to national and international computer
-networks, LYNCH, Howard BESSER, Ronald LARSEN, and Edwin BROWNRIGG
-highlighted the virtues of Internet today and of the network that will
-evolve from Internet. Listeners could discern in these narratives a
-vision of an information democracy in which millions of citizens freely
-find and use what they need. LYNCH noted that a lack of standards
-inhibits disseminating multimedia on the network, a topic also discussed
-by BESSER. LARSEN addressed the issues of network scalability and
-modularity and commented upon the difficulty of anticipating the effects
-of growth in orders of magnitude. BROWNRIGG talked about the ability of
-packet radio to provide certain links in a network without the need for
-wiring. However, the presenters also called attention to the
-shortcomings and incongruities of present-day computer networks. For
-example: 1) Network use is growing dramatically, but much network
-traffic consists of personal communication (E-mail). 2) Large bodies of
-information are available, but a user's ability to search across their
-entirety is limited. 3) There are significant resources for science and
-technology, but few network sources provide content in the humanities.
-4) Machine-readable texts are commonplace, but the capability of the
-system to deal with images (let alone other media formats) lags behind.
-A glimpse of a multimedia future for networks, however, was provided by
-Maria LEBRON in her overview of the Online Journal of Current Clinical
-Trials (OJCCT), and the process of scholarly publishing on-line.
-
-The contrasting form of the CD-ROM disk was never systematically
-analyzed, but attendees could glean an impression from several of the
-show-and-tell presentations. The Perseus and American Memory examples
-demonstrated recently published disks, while the descriptions of the
-IBYCUS version of the Papers of George Washington and Chadwyck-Healey's
-Patrologia Latina Database (PLD) told of disks to come. According to
-Eric CALALUCA, PLD's principal focus has been on converting Jacques-Paul
-Migne's definitive collection of Latin texts to machine-readable form.
-Although everyone could share the network advocates' enthusiasm for an
-on-line future, the possibility of rolling up one's sleeves for a session
-with a CD-ROM containing both textual materials and a powerful retrieval
-engine made the disk seem an appealing vessel indeed. The overall
-discussion suggested that the transition from CD-ROM to on-line networked
-access may prove far slower and more difficult than has been anticipated.
-
-
-WHO ARE THE USERS AND WHAT DO THEY DO?
-
-Although concerned with the technicalities of production, the Workshop
-never lost sight of the purposes and uses of electronic versions of
-textual materials. As noted above, those interested in imaging discussed
-the problematical matter of digital preservation, while the TEI proponents
-described how machine-readable texts can be used in research. This latter
-topic received thorough treatment in the paper read by Avra MICHELSON.
-She placed the phenomenon of electronic texts within the context of
-broader trends in information technology and scholarly communication.
-
-Among other things, MICHELSON described on-line conferences that
-represent a vigorous and important intellectual forum for certain
-disciplines. Internet now carries more than 700 conferences, with about
-80 percent of these devoted to topics in the social sciences and the
-humanities. Other scholars use on-line networks for "distance learning."
-Meanwhile, there has been a tremendous growth in end-user computing;
-professors today are less likely than their predecessors to ask the
-campus computer center to process their data. Electronic texts are one
-key to these sophisticated applications, MICHELSON reported, and more and
-more scholars in the humanities now work in an on-line environment.
-Toward the end of the Workshop, Michael LESK presented a corollary to
-MICHELSON's talk, reporting the results of an experiment that compared
-the work of one group of chemistry students using traditional printed
-texts and two groups using electronic sources. The experiment
-demonstrated that in the event one does not know what to read, one needs
-the electronic systems; the electronic systems hold no advantage at the
-moment if one knows what to read, but neither do they impose a penalty.
-
-DALY provided an anecdotal account of the revolutionizing impact of the
-new technology on his previous methods of research in the field of classics.
-His account, by extrapolation, served to illustrate in part the arguments
-made by MICHELSON concerning the positive effects of the sudden and radical
-transformation being wrought in the ways scholars work.
-
-Susan VECCIA and Joanne FREEMAN delineated the use of electronic
-materials outside the university. The most interesting aspect of their
-use, FREEMAN said, could be seen as a paradox: teachers in elementary
-and secondary schools requested access to primary source materials but,
-at the same time, found that "primariness" itself made these materials
-difficult for their students to use.
-
-
-OTHER TOPICS
-
-Marybeth PETERS reviewed copyright law in the United States and offered
-advice during a lively discussion of this subject. But uncertainty
-remains concerning the price of copyright in a digital medium, because a
-solution remains to be worked out concerning management and synthesis of
-copyrighted and out-of-copyright pieces of a database.
-
-As moderator of the final session of the Workshop, Prosser GIFFORD directed
-discussion to future courses of action and the potential role of LC in
-advancing them. Among the recommendations that emerged were the following:
-
- * Workshop participants should 1) begin to think about working
- with image material, but structure and digitize it in such a
- way that at a later stage it can be interpreted into text, and
- 2) find a common way to build text and images together so that
- they can be used jointly at some stage in the future, with
- appropriate network support, because that is how users will want
- to access these materials. The Library might encourage attempts
- to bring together people who are working on texts and images.
-
- * A network version of American Memory should be developed or
- consideration should be given to making the data in it
- available to people interested in doing network multimedia.
- Given the current dearth of digital data that is appealing and
- unencumbered by extremely complex rights problems, developing a
- network version of American Memory could do much to help make
- network multimedia a reality.
-
- * Concerning the thorny issue of electronic deposit, LC should
- initiate a catalytic process in terms of distributed
- responsibility, that is, bring together the distributed
- organizations and set up a study group to look at all the
- issues related to electronic deposit and see where we as a
- nation should move. For example, LC might attempt to persuade
- one major library in each state to deal with its state
- equivalent publisher, which might produce a cooperative project
- that would be equitably distributed around the country, and one
- in which LC would be dealing with a minimal number of publishers
- and minimal copyright problems. LC must also deal with the
- concept of on-line publishing, determining, among other things,
- how serials such as OJCCT might be deposited for copyright.
-
- * Since a number of projects are planning to carry out
- preservation by creating digital images that will end up in
- on-line or near-line storage at some institution, LC might play
- a helpful role, at least in the near term, by accelerating how
- to catalog that information into the Research Library Information
- Network (RLIN) and then into OCLC, so that it would be accessible.
- This would reduce the possibility of multiple institutions digitizing
- the same work.
-
-
-CONCLUSION
-
-The Workshop was valuable because it brought together partisans from
-various groups and provided an occasion to compare goals and methods.
-The more committed partisans frequently communicate with others in their
-groups, but less often across group boundaries. The Workshop was also
-valuable to attendees--including those involved with American Memory--who
-came less committed to particular approaches or concepts. These
-attendees learned a great deal, and plan to select and employ elements of
-imaging, text-coding, and networked distribution that suit their
-respective projects and purposes.
-
-Still, reality rears its ugly head: no breakthrough has been achieved.
-On the imaging side, one confronts a proliferation of competing
-data-interchange standards and a lack of consensus on the role of digital
-facsimiles in preservation. In the realm of machine-readable texts, one
-encounters a reasonably mature standard but methodological difficulties
-and high costs. These latter problems, of course, represent a special
-impediment to the desire, as it is sometimes expressed in the popular
-press, "to put the [contents of the] Library of Congress on line." In
-the words of one participant, there was "no solution to the economic
-problems--the projects that are out there are surviving, but it is going
-to be a lot of work to transform the information industry, and so far the
-investment to do that is not forthcoming" (LESK, per litteras).
-
-
- *** *** *** ****** *** *** ***
-
-
- PROCEEDINGS
-
-
-WELCOME
-
-+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
-GIFFORD * Origin of Workshop in current Librarian's desire to make LC's
-collections more widely available * Desiderata arising from the prospect
-of greater interconnectedness *
-+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
-
-After welcoming participants on behalf of the Library of Congress,
-American Memory (AM), and the National Demonstration Lab, Prosser
-GIFFORD, director for scholarly programs, Library of Congress, located
-the origin of the Workshop on Electronic Texts in a conversation he had
-had considerably more than a year ago with Carl FLEISCHHAUER concerning
-some of the issues faced by AM. On the assumption that numerous other
-people were asking the same questions, the decision was made to bring
-together as many of these people as possible to ask the same questions
-together. In a deeper sense, GIFFORD said, the origin of the Workshop
-lay in the desire of the current Librarian of Congress, James H.
-Billington, to make the collections of the Library, especially those
-offering unique or unusual testimony on aspects of the American
-experience, available to a much wider circle of users than those few
-people who can come to Washington to use them. This meant that the
-emphasis of AM, from the outset, has been on archival collections of the
-basic material, and on making these collections themselves available,
-rather than selected or heavily edited products.
-
-From AM's emphasis followed the questions with which the Workshop began:
-who will use these materials, and in what form will they wish to use
-them. But an even larger issue deserving mention, in GIFFORD's view, was
-the phenomenal growth in Internet connectivity. He expressed the hope
-that the prospect of greater interconnectedness than ever before would
-lead to: 1) much more cooperative and mutually supportive endeavors; 2)
-development of systems of shared and distributed responsibilities to
-avoid duplication and to ensure accuracy and preservation of unique
-materials; and 3) agreement on the necessary standards and development of
-the appropriate directories and indices to make navigation
-straightforward among the varied resources that are, and increasingly
-will be, available. In this connection, GIFFORD requested that
-participants reflect from the outset upon the sorts of outcomes they
-thought the Workshop might have. Did those present constitute a group
-with sufficient common interests to propose a next step or next steps,
-and if so, what might those be? They would return to these questions the
-following afternoon.
-
- ******
-
-+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
-FLEISCHHAUER * Core of Workshop concerns preparation and production of
-materials * Special challenge in conversion of textual materials *
-Quality versus quantity * Do the several groups represented share common
-interests? *
-+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
-
-Carl FLEISCHHAUER, coordinator, American Memory, Library of Congress,
-emphasized that he would attempt to represent the people who perform some
-of the work of converting or preparing materials and that the core of
-the Workshop had to do with preparation and production. FLEISCHHAUER
-then drew a distinction between the long term, when many things would be
-available and connected in the ways that GIFFORD described, and the short
-term, in which AM not only has wrestled with the issue of what is the
-best course to pursue but also has faced a variety of technical
-challenges.
-
-FLEISCHHAUER remarked AM's endeavors to deal with a wide range of library
-formats, such as motion picture collections, sound-recording collections,
-and pictorial collections of various sorts, especially collections of
-photographs. In the course of these efforts, AM kept coming back to
-textual materials--manuscripts or rare printed matter, bound materials,
-etc. Text posed the greatest conversion challenge of all. Thus, the
-genesis of the Workshop, which reflects the problems faced by AM. These
-problems include physical problems. For example, those in the library
-and archive business deal with collections made up of fragile and rare
-manuscript items, bound materials, especially the notoriously brittle
-bound materials of the late nineteenth century. These are precious
-cultural artifacts, however, as well as interesting sources of
-information, and LC desires to retain and conserve them. AM needs to
-handle things without damaging them. Guillotining a book to run it
-through a sheet feeder must be avoided at all costs.
-
-Beyond physical problems, issues pertaining to quality arose. For
-example, the desire to provide users with a searchable text is affected
-by the question of acceptable level of accuracy. One hundred percent
-accuracy is tremendously expensive. On the other hand, the output of
-optical character recognition (OCR) can be tremendously inaccurate.
-Although AM has attempted to find a middle ground, uncertainty persists
-as to whether or not it has discovered the right solution.
-
-Questions of quality arose concerning images as well. FLEISCHHAUER
-contrasted the extremely high level of quality of the digital images in
-the Cornell Xerox Project with AM's efforts to provide a browse-quality
-or access-quality image, as opposed to an archival or preservation image.
-FLEISCHHAUER therefore welcomed the opportunity to compare notes.
-
-FLEISCHHAUER observed in passing that conversations he had had about
-networks have begun to signal that for various forms of media a
-determination may be made that there is a browse-quality item, or a
-distribution-and-access-quality item that may coexist in some systems
-with a higher quality archival item that would be inconvenient to send
-through the network because of its size. FLEISCHHAUER referred, of
-course, to images more than to searchable text.
-
-As AM considered those questions, several conceptual issues arose: ought
-AM occasionally to reproduce materials entirely through an image set, at
-other times, entirely through a text set, and in some cases, a mix?
-There probably would be times when the historical authenticity of an
-artifact would require that its image be used. An image might be
-desirable as a recourse for users if one could not provide 100-percent
-accurate text. Again, AM wondered, as a practical matter, if a
-distinction could be drawn between rare printed matter that might exist
-in multiple collections--that is, in ten or fifteen libraries. In such
-cases, the need for perfect reproduction would be less than for unique
-items. Implicit in his remarks, FLEISCHHAUER conceded, was the admission
-that AM has been tilting strongly towards quantity and drawing back a
-little from perfect quality. That is, it seemed to AM that society would
-be better served if more things were distributed by LC--even if they were
-not quite perfect--than if fewer things, perfectly represented, were
-distributed. This was stated as a proposition to be tested, with
-responses to be gathered from users.
-
-In thinking about issues related to reproduction of materials and seeing
-other people engaged in parallel activities, AM deemed it useful to
-convene a conference. Hence, the Workshop. FLEISCHHAUER thereupon
-surveyed the several groups represented: 1) the world of images (image
-users and image makers); 2) the world of text and scholarship and, within
-this group, those concerned with language--FLEISCHHAUER confessed to finding
-delightful irony in the fact that some of the most advanced thinkers on
-computerized texts are those dealing with ancient Greek and Roman materials;
-3) the network world; and 4) the general world of library science, which
-includes people interested in preservation and cataloging.
-
-FLEISCHHAUER concluded his remarks with special thanks to the David and
-Lucile Packard Foundation for its support of the meeting, the American
-Memory group, the Office for Scholarly Programs, the National
-Demonstration Lab, and the Office of Special Events. He expressed the
-hope that David Woodley Packard might be able to attend, noting that
-Packard's work and the work of the foundation had sponsored a number of
-projects in the text area.
-
- ******
-
-SESSION I. CONTENT IN A NEW FORM: WHO WILL USE IT AND WHAT WILL THEY DO?
-
-+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
-DALY * Acknowledgements * A new Latin authors disk * Effects of the new
-technology on previous methods of research *
-+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
-
-Serving as moderator, James DALY acknowledged the generosity of all the
-presenters for giving of their time, counsel, and patience in planning
-the Workshop, as well as of members of the American Memory project and
-other Library of Congress staff, and the David and Lucile Packard
-Foundation and its executive director, Colburn S. Wilbur.
-
-DALY then recounted his visit in March to the Center for Electronic Texts
-in the Humanities (CETH) and the Department of Classics at Rutgers
-University, where an old friend, Lowell Edmunds, introduced him to the
-department's IBYCUS scholarly personal computer, and, in particular, the
-new Latin CD-ROM, containing, among other things, almost all classical
-Latin literary texts through A.D. 200. Packard Humanities Institute
-(PHI), Los Altos, California, released this disk late in 1991, with a
-nominal triennial licensing fee.
-
-Playing with the disk for an hour or so at Rutgers brought home to DALY
-at once the revolutionizing impact of the new technology on his previous
-methods of research. Had this disk been available two or three years
-earlier, DALY contended, when he was engaged in preparing a commentary on
-Book 10 of Virgil's Aeneid for Cambridge University Press, he would not
-have required a forty-eight-square-foot table on which to spread the
-numerous, most frequently consulted items, including some ten or twelve
-concordances to key Latin authors, an almost equal number of lexica to
-authors who lacked concordances, and where either lexica or concordances
-were lacking, numerous editions of authors antedating and postdating Virgil.
-
-Nor, when checking each of the average six to seven words contained in
-the Virgilian hexameter for its usage elsewhere in Virgil's works or
-other Latin authors, would DALY have had to maintain the laborious
-mechanical process of flipping through these concordances, lexica, and
-editions each time. Nor would he have had to frequent as often the
-Milton S. Eisenhower Library at the Johns Hopkins University to consult
-the Thesaurus Linguae Latinae. Instead of devoting countless hours, or
-the bulk of his research time, to gathering data concerning Virgil's use
-of words, DALY--now freed by PHI's Latin authors disk from the
-tyrannical, yet in some ways paradoxically happy scholarly drudgery--
-would have been able to devote that same bulk of time to analyzing and
-interpreting Virgilian verbal usage.
-
-Citing Theodore Brunner, Gregory Crane, Elli MYLONAS, and Avra MICHELSON,
-DALY argued that this reversal in his style of work, made possible by the
-new technology, would perhaps have resulted in better, more productive
-research. Indeed, even in the course of his browsing the Latin authors
-disk at Rutgers, its powerful search, retrieval, and highlighting
-capabilities suggested to him several new avenues of research into
-Virgil's use of sound effects. This anecdotal account, DALY maintained,
-may serve to illustrate in part the sudden and radical transformation
-being wrought in the ways scholars work.
-
- ******
-
-++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
-MICHELSON * Elements related to scholarship and technology * Electronic
-texts within the context of broader trends within information technology
-and scholarly communication * Evaluation of the prospects for the use of
-electronic texts * Relationship of electronic texts to processes of
-scholarly communication in humanities research * New exchange formats
-created by scholars * Projects initiated to increase scholarly access to
-converted text * Trend toward making electronic resources available
-through research and education networks * Changes taking place in
-scholarly communication among humanities scholars * Network-mediated
-scholarship transforming traditional scholarly practices * Key
-information technology trends affecting the conduct of scholarly
-communication over the next decade * The trend toward end-user computing
-* The trend toward greater connectivity * Effects of these trends * Key
-transformations taking place * Summary of principal arguments *
-++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
-
-Avra MICHELSON, Archival Research and Evaluation Staff, National Archives
-and Records Administration (NARA), argued that establishing who will use
-electronic texts and what they will use them for involves a consideration
-of both information technology and scholarship trends. This
-consideration includes several elements related to scholarship and
-technology: 1) the key trends in information technology that are most
-relevant to scholarship; 2) the key trends in the use of currently
-available technology by scholars in the nonscientific community; and 3)
-the relationship between these two very distinct but interrelated trends.
-The investment in understanding this relationship being made by
-information providers, technologists, and public policy developers, as
-well as by scholars themselves, seems to be pervasive and growing,
-MICHELSON contended. She drew on collaborative work with Jeff Rothenberg
-on the scholarly use of technology.
-
-MICHELSON sought to place the phenomenon of electronic texts within the
-context of broader trends within information technology and scholarly
-communication. She argued that electronic texts are of most use to
-researchers to the extent that the researchers' working context (i.e.,
-their relevant bibliographic sources, collegial feedback, analytic tools,
-notes, drafts, etc.), along with their field's primary and secondary
-sources, also is accessible in electronic form and can be integrated in
-ways that are unique to the on-line environment.
-
-Evaluation of the prospects for the use of electronic texts includes two
-elements: 1) an examination of the ways in which researchers currently
-are using electronic texts along with other electronic resources, and 2)
-an analysis of key information technology trends that are affecting the
-long-term conduct of scholarly communication. MICHELSON limited her
-discussion of the use of electronic texts to the practices of humanists
-and noted that the scientific community was outside the panel's overview.
-
-MICHELSON examined the nature of the current relationship of electronic
-texts in particular, and electronic resources in general, to what she
-maintained were, essentially, five processes of scholarly communication
-in humanities research. Researchers 1) identify sources, 2) communicate
-with their colleagues, 3) interpret and analyze data, 4) disseminate
-their research findings, and 5) prepare curricula to instruct the next
-generation of scholars and students. This examination would produce a
-clearer understanding of the synergy among these five processes that
-fuels the tendency of the use of electronic resources for one process to
-stimulate its use for other processes of scholarly communication.
-
-For the first process of scholarly communication, the identification of
-sources, MICHELSON remarked the opportunity scholars now enjoy to
-supplement traditional word-of-mouth searches for sources among their
-colleagues with new forms of electronic searching. So, for example,
-instead of having to visit the library, researchers are able to explore
-descriptions of holdings in their offices. Furthermore, if their own
-institutions' holdings prove insufficient, scholars can access more than
-200 major American library catalogues over Internet, including the
-universities of California, Michigan, Pennsylvania, and Wisconsin.
-Direct access to the bibliographic databases offers intellectual
-empowerment to scholars by presenting a comprehensive means of browsing
-through libraries from their homes and offices at their convenience.
-
-The second process of communication involves communication among
-scholars. Beyond the most common methods of communication, scholars are
-using E-mail and a variety of new electronic communications formats
-derived from it for further academic interchange. E-mail exchanges are
-growing at an astonishing rate, reportedly 15 percent a month. They
-currently constitute approximately half the traffic on research and
-education networks. Moreover, the global spread of E-mail has been so
-rapid that it is now possible for American scholars to use it to
-communicate with colleagues in close to 140 other countries.
-
-Other new exchange formats created by scholars and operating on Internet
-include more than 700 conferences, with about 80 percent of these devoted
-to topics in the social sciences and humanities. The rate of growth of
-these scholarly electronic conferences also is astonishing. From l990 to
-l991, 200 new conferences were identified on Internet. From October 1991
-to June 1992, an additional 150 conferences in the social sciences and
-humanities were added to this directory of listings. Scholars have
-established conferences in virtually every field, within every different
-discipline. For example, there are currently close to 600 active social
-science and humanities conferences on topics such as art and
-architecture, ethnomusicology, folklore, Japanese culture, medical
-education, and gifted and talented education. The appeal to scholars of
-communicating through these conferences is that, unlike any other medium,
-electronic conferences today provide a forum for global communication
-with peers at the front end of the research process.
-
-Interpretation and analysis of sources constitutes the third process of
-scholarly communication that MICHELSON discussed in terms of texts and
-textual resources. The methods used to analyze sources fall somewhere on
-a continuum from quantitative analysis to qualitative analysis.
-Typically, evidence is culled and evaluated using methods drawn from both
-ends of this continuum. At one end, quantitative analysis involves the
-use of mathematical processes such as a count of frequencies and
-distributions of occurrences or, on a higher level, regression analysis.
-At the other end of the continuum, qualitative analysis typically
-involves nonmathematical processes oriented toward language
-interpretation or the building of theory. Aspects of this work involve
-the processing--either manual or computational--of large and sometimes
-massive amounts of textual sources, although the use of nontextual
-sources as evidence, such as photographs, sound recordings, film footage,
-and artifacts, is significant as well.
-
-Scholars have discovered that many of the methods of interpretation and
-analysis that are related to both quantitative and qualitative methods
-are processes that can be performed by computers. For example, computers
-can count. They can count brush strokes used in a Rembrandt painting or
-perform regression analysis for understanding cause and effect. By means
-of advanced technologies, computers can recognize patterns, analyze text,
-and model concepts. Furthermore, computers can complete these processes
-faster with more sources and with greater precision than scholars who
-must rely on manual interpretation of data. But if scholars are to use
-computers for these processes, source materials must be in a form
-amenable to computer-assisted analysis. For this reason many scholars,
-once they have identified the sources that are key to their research, are
-converting them to machine-readable form. Thus, a representative example
-of the numerous textual conversion projects organized by scholars around
-the world in recent years to support computational text analysis is the
-TLG, the Thesaurus Linguae Graecae. This project is devoted to
-converting the extant ancient texts of classical Greece. (Editor's note:
-according to the TLG Newsletter of May l992, TLG was in use in thirty-two
-different countries. This figure updates MICHELSON's previous count by one.)
-
-The scholars performing these conversions have been asked to recognize
-that the electronic sources they are converting for one use possess value
-for other research purposes as well. As a result, during the past few
-years, humanities scholars have initiated a number of projects to
-increase scholarly access to converted text. So, for example, the Text
-Encoding Initiative (TEI), about which more is said later in the program,
-was established as an effort by scholars to determine standard elements
-and methods for encoding machine-readable text for electronic exchange.
-In a second effort to facilitate the sharing of converted text, scholars
-have created a new institution, the Center for Electronic Texts in the
-Humanities (CETH). The center estimates that there are 8,000 series of
-source texts in the humanities that have been converted to
-machine-readable form worldwide. CETH is undertaking an international
-search for converted text in the humanities, compiling it into an
-electronic library, and preparing bibliographic descriptions of the
-sources for the Research Libraries Information Network's (RLIN)
-machine-readable data file. The library profession has begun to initiate
-large conversion projects as well, such as American Memory.
-
-While scholars have been making converted text available to one another,
-typically on disk or on CD-ROM, the clear trend is toward making these
-resources available through research and education networks. Thus, the
-American and French Research on the Treasury of the French Language
-(ARTFL) and the Dante Project are already available on Internet.
-MICHELSON summarized this section on interpretation and analysis by
-noting that: 1) increasing numbers of humanities scholars in the library
-community are recognizing the importance to the advancement of
-scholarship of retrospective conversion of source materials in the arts
-and humanities; and 2) there is a growing realization that making the
-sources available on research and education networks maximizes their
-usefulness for the analysis performed by humanities scholars.
-
-The fourth process of scholarly communication is dissemination of
-research findings, that is, publication. Scholars are using existing
-research and education networks to engineer a new type of publication:
-scholarly-controlled journals that are electronically produced and
-disseminated. Although such journals are still emerging as a
-communication format, their number has grown, from approximately twelve
-to thirty-six during the past year (July 1991 to June 1992). Most of
-these electronic scholarly journals are devoted to topics in the
-humanities. As with network conferences, scholarly enthusiasm for these
-electronic journals stems from the medium's unique ability to advance
-scholarship in a way that no other medium can do by supporting global
-feedback and interchange, practically in real time, early in the research
-process. Beyond scholarly journals, MICHELSON remarked the delivery of
-commercial full-text products, such as articles in professional journals,
-newsletters, magazines, wire services, and reference sources. These are
-being delivered via on-line local library catalogues, especially through
-CD-ROMs. Furthermore, according to MICHELSON, there is general optimism
-that the copyright and fees issues impeding the delivery of full text on
-existing research and education networks soon will be resolved.
-
-The final process of scholarly communication is curriculum development
-and instruction, and this involves the use of computer information
-technologies in two areas. The first is the development of
-computer-oriented instructional tools, which includes simulations,
-multimedia applications, and computer tools that are used to assist in
-the analysis of sources in the classroom, etc. The Perseus Project, a
-database that provides a multimedia curriculum on classical Greek
-civilization, is a good example of the way in which entire curricula are
-being recast using information technologies. It is anticipated that the
-current difficulty in exchanging electronically computer-based
-instructional software, which in turn makes it difficult for one scholar
-to build upon the work of others, will be resolved before too long.
-Stand-alone curricular applications that involve electronic text will be
-sharable through networks, reinforcing their significance as intellectual
-products as well as instructional tools.
-
-The second aspect of electronic learning involves the use of research and
-education networks for distance education programs. Such programs
-interactively link teachers with students in geographically scattered
-locations and rely on the availability of electronic instructional
-resources. Distance education programs are gaining wide appeal among
-state departments of education because of their demonstrated capacity to
-bring advanced specialized course work and an array of experts to many
-classrooms. A recent report found that at least 32 states operated at
-least one statewide network for education in 1991, with networks under
-development in many of the remaining states.
-
-MICHELSON summarized this section by noting two striking changes taking
-place in scholarly communication among humanities scholars. First is the
-extent to which electronic text in particular, and electronic resources
-in general, are being infused into each of the five processes described
-above. As mentioned earlier, there is a certain synergy at work here.
-The use of electronic resources for one process tends to stimulate its
-use for other processes, because the chief course of movement is toward a
-comprehensive on-line working context for humanities scholars that
-includes on-line availability of key bibliographies, scholarly feedback,
-sources, analytical tools, and publications. MICHELSON noted further
-that the movement toward a comprehensive on-line working context for
-humanities scholars is not new. In fact, it has been underway for more
-than forty years in the humanities, since Father Roberto Busa began
-developing an electronic concordance of the works of Saint Thomas Aquinas
-in 1949. What we are witnessing today, MICHELSON contended, is not the
-beginning of this on-line transition but, for at least some humanities
-scholars, the turning point in the transition from a print to an
-electronic working context. Coinciding with the on-line transition, the
-second striking change is the extent to which research and education
-networks are becoming the new medium of scholarly communication. The
-existing Internet and the pending National Education and Research Network
-(NREN) represent the new meeting ground where scholars are going for
-bibliographic information, scholarly dialogue and feedback, the most
-current publications in their field, and high-level educational
-offerings. Traditional scholarly practices are undergoing tremendous
-transformations as a result of the emergence and growing prominence of
-what is called network-mediated scholarship.
-
-MICHELSON next turned to the second element of the framework she proposed
-at the outset of her talk for evaluating the prospects for electronic
-text, namely the key information technology trends affecting the conduct
-of scholarly communication over the next decade: 1) end-user computing
-and 2) connectivity.
-
-End-user computing means that the person touching the keyboard, or
-performing computations, is the same as the person who initiates or
-consumes the computation. The emergence of personal computers, along
-with a host of other forces, such as ubiquitous computing, advances in
-interface design, and the on-line transition, is prompting the consumers
-of computation to do their own computing, and is thus rendering obsolete
-the traditional distinction between end users and ultimate users.
-
-The trend toward end-user computing is significant to consideration of
-the prospects for electronic texts because it means that researchers are
-becoming more adept at doing their own computations and, thus, more
-competent in the use of electronic media. By avoiding programmer
-intermediaries, computation is becoming central to the researcher's
-thought process. This direct involvement in computing is changing the
-researcher's perspective on the nature of research itself, that is, the
-kinds of questions that can be posed, the analytical methodologies that
-can be used, the types and amount of sources that are appropriate for
-analyses, and the form in which findings are presented. The trend toward
-end-user computing means that, increasingly, electronic media and
-computation are being infused into all processes of humanities
-scholarship, inspiring remarkable transformations in scholarly
-communication.
-
-The trend toward greater connectivity suggests that researchers are using
-computation increasingly in network environments. Connectivity is
-important to scholarship because it erases the distance that separates
-students from teachers and scholars from their colleagues, while allowing
-users to access remote databases, share information in many different
-media, connect to their working context wherever they are, and
-collaborate in all phases of research.
-
-The combination of the trend toward end-user computing and the trend
-toward connectivity suggests that the scholarly use of electronic
-resources, already evident among some researchers, will soon become an
-established feature of scholarship. The effects of these trends, along
-with ongoing changes in scholarly practices, point to a future in which
-humanities researchers will use computation and electronic communication
-to help them formulate ideas, access sources, perform research,
-collaborate with colleagues, seek peer review, publish and disseminate
-results, and engage in many other professional and educational activities.
-
-In summary, MICHELSON emphasized four points: 1) A portion of humanities
-scholars already consider electronic texts the preferred format for
-analysis and dissemination. 2) Scholars are using these electronic
-texts, in conjunction with other electronic resources, in all the
-processes of scholarly communication. 3) The humanities scholars'
-working context is in the process of changing from print technology to
-electronic technology, in many ways mirroring transformations that have
-occurred or are occurring within the scientific community. 4) These
-changes are occurring in conjunction with the development of a new
-communication medium: research and education networks that are
-characterized by their capacity to advance scholarship in a wholly unique
-way.
-
-MICHELSON also reiterated her three principal arguments: l) Electronic
-texts are best understood in terms of the relationship to other
-electronic resources and the growing prominence of network-mediated
-scholarship. 2) The prospects for electronic texts lie in their capacity
-to be integrated into the on-line network of electronic resources that
-comprise the new working context for scholars. 3) Retrospective conversion
-of portions of the scholarly record should be a key strategy as information
-providers respond to changes in scholarly communication practices.
-
- ******
-
-+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
-VECCIA * AM's evaluation project and public users of electronic resources
-* AM and its design * Site selection and evaluating the Macintosh
-implementation of AM * Characteristics of the six public libraries
-selected * Characteristics of AM's users in these libraries * Principal
-ways AM is being used *
-+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
-
-Susan VECCIA, team leader, and Joanne FREEMAN, associate coordinator,
-American Memory, Library of Congress, gave a joint presentation. First,
-by way of introduction, VECCIA explained her and FREEMAN's roles in
-American Memory (AM). Serving principally as an observer, VECCIA has
-assisted with the evaluation project of AM, placing AM collections in a
-variety of different sites around the country and helping to organize and
-implement that project. FREEMAN has been an associate coordinator of AM
-and has been involved principally with the interpretative materials,
-preparing some of the electronic exhibits and printed historical
-information that accompanies AM and that is requested by users. VECCIA
-and FREEMAN shared anecdotal observations concerning AM with public users
-of electronic resources. Notwithstanding a fairly structured evaluation
-in progress, both VECCIA and FREEMAN chose not to report on specifics in
-terms of numbers, etc., because they felt it was too early in the
-evaluation project to do so.
-
-AM is an electronic archive of primary source materials from the Library
-of Congress, selected collections representing a variety of formats--
-photographs, graphic arts, recorded sound, motion pictures, broadsides,
-and soon, pamphlets and books. In terms of the design of this system,
-the interpretative exhibits have been kept separate from the primary
-resources, with good reason. Accompanying this collection are printed
-documentation and user guides, as well as guides that FREEMAN prepared for
-teachers so that they may begin using the content of the system at once.
-
-VECCIA described the evaluation project before talking about the public
-users of AM, limiting her remarks to public libraries, because FREEMAN
-would talk more specifically about schools from kindergarten to twelfth
-grade (K-12). Having started in spring 1991, the evaluation currently
-involves testing of the Macintosh implementation of AM. Since the
-primary goal of this evaluation is to determine the most appropriate
-audience or audiences for AM, very different sites were selected. This
-makes evaluation difficult because of the varying degrees of technology
-literacy among the sites. AM is situated in forty-four locations, of
-which six are public libraries and sixteen are schools. Represented
-among the schools are elementary, junior high, and high schools.
-District offices also are involved in the evaluation, which will
-conclude in summer 1993.
-
-VECCIA focused the remainder of her talk on the six public libraries, one
-of which doubles as a state library. They represent a range of
-geographic areas and a range of demographic characteristics. For
-example, three are located in urban settings, two in rural settings, and
-one in a suburban setting. A range of technical expertise is to be found
-among these facilities as well. For example, one is an "Apple library of
-the future," while two others are rural one-room libraries--in one, AM
-sits at the front desk next to a tractor manual.
-
-All public libraries have been extremely enthusiastic, supportive, and
-appreciative of the work that AM has been doing. VECCIA characterized
-various users: Most users in public libraries describe themselves as
-general readers; of the students who use AM in the public libraries,
-those in fourth grade and above seem most interested. Public libraries
-in rural sites tend to attract retired people, who have been highly
-receptive to AM. Users tend to fall into two additional categories:
-people interested in the content and historical connotations of these
-primary resources, and those fascinated by the technology. The format
-receiving the most comments has been motion pictures. The adult users in
-public libraries are more comfortable with IBM computers, whereas young
-people seem comfortable with either IBM or Macintosh, although most of
-them seem to come from a Macintosh background. This same tendency is
-found in the schools.
-
-What kinds of things do users do with AM? In a public library there are
-two main goals or ways that AM is being used: as an individual learning
-tool, and as a leisure activity. Adult learning was one area that VECCIA
-would highlight as a possible application for a tool such as AM. She
-described a patron of a rural public library who comes in every day on
-his lunch hour and literally reads AM, methodically going through the
-collection image by image. At the end of his hour he makes an electronic
-bookmark, puts it in his pocket, and returns to work. The next day he
-comes in and resumes where he left off. Interestingly, this man had
-never been in the library before he used AM. In another small, rural
-library, the coordinator reports that AM is a popular activity for some
-of the older, retired people in the community, who ordinarily would not
-use "those things,"--computers. Another example of adult learning in
-public libraries is book groups, one of which, in particular, is using AM
-as part of its reading on industrialization, integration, and urbanization
-in the early 1900s.
-
-One library reports that a family is using AM to help educate their
-children. In another instance, individuals from a local museum came in
-to use AM to prepare an exhibit on toys of the past. These two examples
-emphasize the mission of the public library as a cultural institution,
-reaching out to people who do not have the same resources available to
-those who live in a metropolitan area or have access to a major library.
-One rural library reports that junior high school students in large
-numbers came in one afternoon to use AM for entertainment. A number of
-public libraries reported great interest among postcard collectors in the
-Detroit collection, which was essentially a collection of images used on
-postcards around the turn of the century. Train buffs are similarly
-interested because that was a time of great interest in railroading.
-People, it was found, relate to things that they know of firsthand. For
-example, in both rural public libraries where AM was made available,
-observers reported that the older people with personal remembrances of
-the turn of the century were gravitating to the Detroit collection.
-These examples served to underscore MICHELSON's observation re the
-integration of electronic tools and ideas--that people learn best when
-the material relates to something they know.
-
-VECCIA made the final point that in many cases AM serves as a
-public-relations tool for the public libraries that are testing it. In
-one case, AM is being used as a vehicle to secure additional funding for
-the library. In another case, AM has served as an inspiration to the
-staff of a major local public library in the South to think about ways to
-make its own collection of photographs more accessible to the public.
-
- ******
-
-+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
-FREEMAN * AM and archival electronic resources in a school environment *
-Questions concerning context * Questions concerning the electronic format
-itself * Computer anxiety * Access and availability of the system *
-Hardware * Strengths gained through the use of archival resources in
-schools *
-+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
-
-Reiterating an observation made by VECCIA, that AM is an archival
-resource made up of primary materials with very little interpretation,
-FREEMAN stated that the project has attempted to bridge the gap between
-these bare primary materials and a school environment, and in that cause
-has created guided introductions to AM collections. Loud demand from the
-educational community, chiefly from teachers working with the upper
-grades of elementary school through high school, greeted the announcement
-that AM would be tested around the country.
-
-FREEMAN reported not only on what was learned about AM in a school
-environment, but also on several universal questions that were raised
-concerning archival electronic resources in schools. She discussed
-several strengths of this type of material in a school environment as
-opposed to a highly structured resource that offers a limited number of
-paths to follow.
-
-FREEMAN first raised several questions about using AM in a school
-environment. There is often some difficulty in developing a sense of
-what the system contains. Many students sit down at a computer resource
-and assume that, because AM comes from the Library of Congress, all of
-American history is now at their fingertips. As a result of that sort of
-mistaken judgment, some students are known to conclude that AM contains
-nothing of use to them when they look for one or two things and do not
-find them. It is difficult to discover that middle ground where one has
-a sense of what the system contains. Some students grope toward the idea
-of an archive, a new idea to them, since they have not previously
-experienced what it means to have access to a vast body of somewhat
-random information.
-
-Other questions raised by FREEMAN concerned the electronic format itself.
-For instance, in a school environment it is often difficult both for
-teachers and students to gain a sense of what it is they are viewing.
-They understand that it is a visual image, but they do not necessarily
-know that it is a postcard from the turn of the century, a panoramic
-photograph, or even machine-readable text of an eighteenth-century
-broadside, a twentieth-century printed book, or a nineteenth-century
-diary. That distinction is often difficult for people in a school
-environment to grasp. Because of that, it occasionally becomes difficult
-to draw conclusions from what one is viewing.
-
-FREEMAN also noted the obvious fear of the computer, which constitutes a
-difficulty in using an electronic resource. Though students in general
-did not suffer from this anxiety, several older students feared that they
-were computer-illiterate, an assumption that became self-fulfilling when
-they searched for something but failed to find it. FREEMAN said she
-believed that some teachers also fear computer resources, because they
-believe they lack complete control. FREEMAN related the example of
-teachers shooing away students because it was not their time to use the
-system. This was a case in which the situation had to be extremely
-structured so that the teachers would not feel that they had lost their
-grasp on what the system contained.
-
-A final question raised by FREEMAN concerned access and availability of
-the system. She noted the occasional existence of a gap in communication
-between school librarians and teachers. Often AM sits in a school
-library and the librarian is the person responsible for monitoring the
-system. Teachers do not always take into their world new library
-resources about which the librarian is excited. Indeed, at the sites
-where AM had been used most effectively within a library, the librarian
-was required to go to specific teachers and instruct them in its use. As
-a result, several AM sites will have in-service sessions over a summer,
-in the hope that perhaps, with a more individualized link, teachers will
-be more likely to use the resource.
-
-A related issue in the school context concerned the number of
-workstations available at any one location. Centralization of equipment
-at the district level, with teachers invited to download things and walk
-away with them, proved unsuccessful because the hours these offices were
-open were also school hours.
-
-Another issue was hardware. As VECCIA observed, a range of sites exists,
-some technologically advanced and others essentially acquiring their
-first computer for the primary purpose of using it in conjunction with
-AM's testing. Users at technologically sophisticated sites want even
-more sophisticated hardware, so that they can perform even more
-sophisticated tasks with the materials in AM. But once they acquire a
-newer piece of hardware, they must learn how to use that also; at an
-unsophisticated site it takes an extremely long time simply to become
-accustomed to the computer, not to mention the program offered with the
-computer. All of these small issues raise one large question, namely,
-are systems like AM truly rewarding in a school environment, or do they
-simply act as innovative toys that do little more than spark interest?
-
-FREEMAN contended that the evaluation project has revealed several strengths
-that were gained through the use of archival resources in schools, including:
-
- * Psychic rewards from using AM as a vast, rich database, with
- teachers assigning various projects to students--oral presentations,
- written reports, a documentary, a turn-of-the-century newspaper--
- projects that start with the materials in AM but are completed using
- other resources; AM thus is used as a research tool in conjunction
- with other electronic resources, as well as with books and items in
- the library where the system is set up.
-
- * Students are acquiring computer literacy in a humanities context.
-
- * This sort of system is overcoming the isolation between disciplines
- that often exists in schools. For example, many English teachers are
- requiring their students to write papers on historical topics
- represented in AM. Numerous teachers have reported that their
- students are learning critical thinking skills using the system.
-
- * On a broader level, AM is introducing primary materials, not only
- to students but also to teachers, in an environment where often
- simply none exist--an exciting thing for the students because it
- helps them learn to conduct research, to interpret, and to draw
- their own conclusions. In learning to conduct research and what it
- means, students are motivated to seek knowledge. That relates to
- another positive outcome--a high level of personal involvement of
- students with the materials in this system and greater motivation to
- conduct their own research and draw their own conclusions.
-
- * Perhaps the most ironic strength of these kinds of archival
- electronic resources is that many of the teachers AM interviewed
- were desperate, it is no exaggeration to say, not only for primary
- materials but for unstructured primary materials. These would, they
- thought, foster personally motivated research, exploration, and
- excitement in their students. Indeed, these materials have done
- just that. Ironically, however, this lack of structure produces
- some of the confusion to which the newness of these kinds of
- resources may also contribute. The key to effective use of archival
- products in a school environment is a clear, effective introduction
- to the system and to what it contains.
-
- ******
-
-+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
-DISCUSSION * Nothing known, quantitatively, about the number of
-humanities scholars who must see the original versus those who would
-settle for an edited transcript, or about the ways in which humanities
-scholars are using information technology * Firm conclusions concerning
-the manner and extent of the use of supporting materials in print
-provided by AM to await completion of evaluative study * A listener's
-reflections on additional applications of electronic texts * Role of
-electronic resources in teaching elementary research skills to students *
-+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
-
-During the discussion that followed the presentations by MICHELSON,
-VECCIA, and FREEMAN, additional points emerged.
-
-LESK asked if MICHELSON could give any quantitative estimate of the
-number of humanities scholars who must see or want to see the original,
-or the best possible version of the material, versus those who typically
-would settle for an edited transcript. While unable to provide a figure,
-she offered her impressions as an archivist who has done some reference
-work and has discussed this issue with other archivists who perform
-reference, that those who use archives and those who use primary sources
-for what would be considered very high-level scholarly research, as
-opposed to, say, undergraduate papers, were few in number, especially
-given the public interest in using primary sources to conduct
-genealogical or avocational research and the kind of professional
-research done by people in private industry or the federal government.
-More important in MICHELSON's view was that, quantitatively, nothing is
-known about the ways in which, for example, humanities scholars are using
-information technology. No studies exist to offer guidance in creating
-strategies. The most recent study was conducted in 1985 by the American
-Council of Learned Societies (ACLS), and what it showed was that 50
-percent of humanities scholars at that time were using computers. That
-constitutes the extent of our knowledge.
-
-Concerning AM's strategy for orienting people toward the scope of
-electronic resources, FREEMAN could offer no hard conclusions at this
-point, because she and her colleagues were still waiting to see,
-particularly in the schools, what has been made of their efforts. Within
-the system, however, AM has provided what are called electronic exhibits-
--such as introductions to time periods and materials--and these are
-intended to offer a student user a sense of what a broadside is and what
-it might tell her or him. But FREEMAN conceded that the project staff
-would have to talk with students next year, after teachers have had a
-summer to use the materials, and attempt to discover what the students
-were learning from the materials. In addition, FREEMAN described
-supporting materials in print provided by AM at the request of local
-teachers during a meeting held at LC. These included time lines,
-bibliographies, and other materials that could be reproduced on a
-photocopier in a classroom. Teachers could walk away with and use these,
-and in this way gain a better understanding of the contents. But again,
-reaching firm conclusions concerning the manner and extent of their use
-would have to wait until next year.
-
-As to the changes she saw occurring at the National Archives and Records
-Administration (NARA) as a result of the increasing emphasis on
-technology in scholarly research, MICHELSON stated that NARA at this
-point was absorbing the report by her and Jeff Rothenberg addressing
-strategies for the archival profession in general, although not for the
-National Archives specifically. NARA is just beginning to establish its
-role and what it can do. In terms of changes and initiatives that NARA
-can take, no clear response could be given at this time.
-
-GREENFIELD remarked two trends mentioned in the session. Reflecting on
-DALY's opening comments on how he could have used a Latin collection of
-text in an electronic form, he said that at first he thought most scholars
-would be unwilling to do that. But as he thought of that in terms of the
-original meaning of research--that is, having already mastered these texts,
-researching them for critical and comparative purposes--for the first time,
-the electronic format made a lot of sense. GREENFIELD could envision
-growing numbers of scholars learning the new technologies for that very
-aspect of their scholarship and for convenience's sake.
-
-Listening to VECCIA and FREEMAN, GREENFIELD thought of an additional
-application of electronic texts. He realized that AM could be used as a
-guide to lead someone to original sources. Students cannot be expected
-to have mastered these sources, things they have never known about
-before. Thus, AM is leading them, in theory, to a vast body of
-information and giving them a superficial overview of it, enabling them
-to select parts of it. GREENFIELD asked if any evidence exists that this
-resource will indeed teach the new user, the K-12 students, how to do
-research. Scholars already know how to do research and are applying
-these new tools. But he wondered why students would go beyond picking
-out things that were most exciting to them.
-
-FREEMAN conceded the correctness of GREENFIELD's observation as applied
-to a school environment. The risk is that a student would sit down at a
-system, play with it, find some things of interest, and then walk away.
-But in the relatively controlled situation of a school library, much will
-depend on the instructions a teacher or a librarian gives a student. She
-viewed the situation not as one of fine-tuning research skills but of
-involving students at a personal level in understanding and researching
-things. Given the guidance one can receive at school, it then becomes
-possible to teach elementary research skills to students, which in fact
-one particular librarian said she was teaching her fifth graders.
-FREEMAN concluded that introducing the idea of following one's own path
-of inquiry, which is essentially what research entails, involves more
-than teaching specific skills. To these comments VECCIA added the
-observation that the individual teacher and the use of a creative
-resource, rather than AM itself, seemed to make the key difference.
-Some schools and some teachers are making excellent use of the nature
-of critical thinking and teaching skills, she said.
-
-Concurring with these remarks, DALY closed the session with the thought that
-the more that producers produced for teachers and for scholars to use with
-their students, the more successful their electronic products would prove.
-
- ******
-
-SESSION II. SHOW AND TELL
-
-Jacqueline HESS, director, National Demonstration Laboratory, served as
-moderator of the "show-and-tell" session. She noted that a
-question-and-answer period would follow each presentation.
-
-+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
-MYLONAS * Overview and content of Perseus * Perseus' primary materials
-exist in a system-independent, archival form * A concession * Textual
-aspects of Perseus * Tools to use with the Greek text * Prepared indices
-and full-text searches in Perseus * English-Greek word search leads to
-close study of words and concepts * Navigating Perseus by tracing down
-indices * Using the iconography to perform research *
-+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
-
-Elli MYLONAS, managing editor, Perseus Project, Harvard University, first
-gave an overview of Perseus, a large, collaborative effort based at
-Harvard University but with contributors and collaborators located at
-numerous universities and colleges in the United States (e.g., Bowdoin,
-Maryland, Pomona, Chicago, Virginia). Funded primarily by the
-Annenberg/CPB Project, with additional funding from Apple, Harvard, and
-the Packard Humanities Institute, among others, Perseus is a multimedia,
-hypertextual database for teaching and research on classical Greek
-civilization, which was released in February 1992 in version 1.0 and
-distributed by Yale University Press.
-
-Consisting entirely of primary materials, Perseus includes ancient Greek
-texts and translations of those texts; catalog entries--that is, museum
-catalog entries, not library catalog entries--on vases, sites, coins,
-sculpture, and archaeological objects; maps; and a dictionary, among
-other sources. The number of objects and the objects for which catalog
-entries exist are accompanied by thousands of color images, which
-constitute a major feature of the database. Perseus contains
-approximately 30 megabytes of text, an amount that will double in
-subsequent versions. In addition to these primary materials, the Perseus
-Project has been building tools for using them, making access and
-navigation easier, the goal being to build part of the electronic
-environment discussed earlier in the morning in which students or
-scholars can work with their sources.
-
-The demonstration of Perseus will show only a fraction of the real work
-that has gone into it, because the project had to face the dilemma of
-what to enter when putting something into machine-readable form: should
-one aim for very high quality or make concessions in order to get the
-material in? Since Perseus decided to opt for very high quality, all of
-its primary materials exist in a system-independent--insofar as it is
-possible to be system-independent--archival form. Deciding what that
-archival form would be and attaining it required much work and thought.
-For example, all the texts are marked up in SGML, which will be made
-compatible with the guidelines of the Text Encoding Initiative (TEI) when
-they are issued.
-
-Drawings are postscript files, not meeting international standards, but
-at least designed to go across platforms. Images, or rather the real
-archival forms, consist of the best available slides, which are being
-digitized. Much of the catalog material exists in database form--a form
-that the average user could use, manipulate, and display on a personal
-computer, but only at great cost. Thus, this is where the concession
-comes in: All of this rich, well-marked-up information is stripped of
-much of its content; the images are converted into bit-maps and the text
-into small formatted chunks. All this information can then be imported
-into HyperCard and run on a mid-range Macintosh, which is what Perseus
-users have. This fact has made it possible for Perseus to attain wide
-use fairly rapidly. Without those archival forms the HyperCard version
-being demonstrated could not be made easily, and the project could not
-have the potential to move to other forms and machines and software as
-they appear, none of which information is in Perseus on the CD.
-
-Of the numerous multimedia aspects of Perseus, MYLONAS focused on the
-textual. Part of what makes Perseus such a pleasure to use, MYLONAS
-said, is this effort at seamless integration and the ability to move
-around both visual and textual material. Perseus also made the decision
-not to attempt to interpret its material any more than one interprets by
-selecting. But, MYLONAS emphasized, Perseus is not courseware: No
-syllabus exists. There is no effort to define how one teaches a topic
-using Perseus, although the project may eventually collect papers by
-people who have used it to teach. Rather, Perseus aims to provide
-primary material in a kind of electronic library, an electronic sandbox,
-so to say, in which students and scholars who are working on this
-material can explore by themselves. With that, MYLONAS demonstrated
-Perseus, beginning with the Perseus gateway, the first thing one sees
-upon opening Perseus--an effort in part to solve the contextualizing
-problem--which tells the user what the system contains.
-
-MYLONAS demonstrated only a very small portion, beginning with primary
-texts and running off the CD-ROM. Having selected Aeschylus' Prometheus
-Bound, which was viewable in Greek and English pretty much in the same
-segments together, MYLONAS demonstrated tools to use with the Greek text,
-something not possible with a book: looking up the dictionary entry form
-of an unfamiliar word in Greek after subjecting it to Perseus'
-morphological analysis for all the texts. After finding out about a
-word, a user may then decide to see if it is used anywhere else in Greek.
-Because vast amounts of indexing support all of the primary material, one
-can find out where else all forms of a particular Greek word appear--
-often not a trivial matter because Greek is highly inflected. Further,
-since the story of Prometheus has to do with the origins of sacrifice, a
-user may wish to study and explore sacrifice in Greek literature; by
-typing sacrifice into a small window, a user goes to the English-Greek
-word list--something one cannot do without the computer (Perseus has
-indexed the definitions of its dictionary)--the string sacrifice appears
-in the definitions of these sixty-five words. One may then find out
-where any of those words is used in the work(s) of a particular author.
-The English definitions are not lemmatized.
-
-All of the indices driving this kind of usage were originally devised for
-speed, MYLONAS observed; in other words, all that kind of information--
-all forms of all words, where they exist, the dictionary form they belong
-to--were collected into databases, which will expedite searching. Then
-it was discovered that one can do things searching in these databases
-that could not be done searching in the full texts. Thus, although there
-are full-text searches in Perseus, much of the work is done behind the
-scenes, using prepared indices. Re the indexing that is done behind the
-scenes, MYLONAS pointed out that without the SGML forms of the text, it
-could not be done effectively. Much of this indexing is based on the
-structures that are made explicit by the SGML tagging.
-
-It was found that one of the things many of Perseus' non-Greek-reading
-users do is start from the dictionary and then move into the close study
-of words and concepts via this kind of English-Greek word search, by which
-means they might select a concept. This exercise has been assigned to
-students in core courses at Harvard--to study a concept by looking for the
-English word in the dictionary, finding the Greek words, and then finding
-the words in the Greek but, of course, reading across in the English.
-That tells them a great deal about what a translation means as well.
-
-Should one also wish to see images that have to do with sacrifice, that
-person would go to the object key word search, which allows one to
-perform a similar kind of index retrieval on the database of
-archaeological objects. Without words, pictures are useless; Perseus has
-not reached the point where it can do much with images that are not
-cataloged. Thus, although it is possible in Perseus with text and images
-to navigate by knowing where one wants to end up--for example, a
-red-figure vase from the Boston Museum of Fine Arts--one can perform this
-kind of navigation very easily by tracing down indices. MYLONAS
-illustrated several generic scenes of sacrifice on vases. The features
-demonstrated derived from Perseus 1.0; version 2.0 will implement even
-better means of retrieval.
-
-MYLONAS closed by looking at one of the pictures and noting again that
-one can do a great deal of research using the iconography as well as the
-texts. For instance, students in a
<TRUNCATED>