Archive for category libraries

[2b2k] Libraries are platforms?

I’m at the DPLA Plenary meeting, heading toward the first public presentation — a status report — on the prototype DPLA platform we’ve been building at Berkman and the Library Innovation Lab. So, tons of intellectual stimulation, as well as a fair bit of stress.

The platform we’ve been building is a software platform, i.e., a set of data and services offered through an API so that developers can use it to build end-user applications, and so other sites can integrate DPLA data into their sites. But I’ve been thinking for the past few weeks about ways in which libraries can (and perhaps should) view themselves as platforms in a broader sense. I want to write about this more, but here’s an initial set of draft-y thoughts about platforms as a way of framing the library issue.
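To make the distinction concrete, here is a minimal sketch, in Python, of the sort of call a developer might make against a platform like this. The endpoint URL, query parameter, and response fields are hypothetical placeholders, not the actual prototype’s API; the point is only that the platform supplies data and services while the application decides how to present them.

    # A sketch of a developer's call to a hypothetical DPLA-style platform API.
    # The endpoint, parameter names, and response fields below are invented
    # placeholders, not the actual prototype's API.
    import json
    import urllib.parse
    import urllib.request

    def search_items(query, base_url="https://api.example.org/v1/items"):
        """Query a hypothetical JSON search endpoint and return the parsed response."""
        url = base_url + "?" + urllib.parse.urlencode({"q": query})
        with urllib.request.urlopen(url) as resp:
            return json.loads(resp.read().decode("utf-8"))

    # An end-user application (or another site) decides how to present the records;
    # the platform only supplies the data and services.
    for item in search_items("walt whitman").get("items", []):
        print(item.get("title"), "/", item.get("source"))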

Libraries are attached to communities, whether local towns, universities, or other institutions. Traditionally, much of their value has been in providing access to knowledge and cultural objects of particular sorts (you know, like books and stuff). Libraries thus have been platforms for knowledge and culture: they provide a reliable, open resource that enables knowledge and culture to be developed and pursued.

As the content of knowledge and culture changes from physical to digital (over time and never completely), perhaps it’s helpful to think about libraries in their abstract sense as platforms. What might a library platform look like in the age of digital networks? (An hour later: Note that this type of platform would be very different from what we’re working on for the DPLA.)

It would give its community open access to the objects of knowledge and culture. It would include physical spaces as a particularly valuable sort of node. But the platform would do much more. If the mission is to help the community develop and pursue knowledge and culture, it would certainly provide tools and services that enable communities to form around these objects. The platform would make public the work of local creators, and would provide contexts within which these works can be found, discussed, elaborated, and appropriated. It would provide an ecosystem in which ideas and conversations flow out and in, weaving objects into local meanings and lives. Of course it would allow the local culture to flourish while simultaneously connecting it with the rest of the world — ideally by beginning with linking it into other local library platforms.

This is obviously not a well-worked-out idea. It also contains nothing that hasn’t been discussed for decades now. What I like about it (at least for now) is that a platform provides a positive metaphor for thinking about the value of libraries, one that both helps explain their traditional value and points to their opportunity as they face the future.

DPLA session beginning. Will post without rereading… (Hat tip to Tim O’Reilly who has been talking about government as a platform for a few years now.) (Later: Also, my friend and DPLA colleague Nate Hill blogged a couple of months ago about libraries as local publishing platforms.)

Library News

Did I ever mention the really useful site Matt Phillips and Jeff Goldenson at the Library Innovation Lab put up a couple of weeks ago? If you are interested in libraries and tech, Library News is a community-supported news site where you’ll find a steady stream of interesting articles. Or, put differently, it’s the Hacker News code pointed at library tech articles.

I have it open all day. Try it. Contribute to it. Go library hacker nuts!

Physical libraries in a digital world

I’m at the final meeting of a Harvard course on the future of libraries, led by John Palfrey and Jeffrey Schnapp. They have three guests in to talk about physical library space.

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

David Lamberth lays out an idea as a provocation. He begins by pointing out that until the beginning of the 20th century, a library was not a place but only a collection of books. He gives a quick history of the Harvard Library. After the library burned down in 1764, the libraries lived in fear of fire until electric lights came in. The replacement library (Gore Hall) was built out of stone because brick structures need wood on the inside. But stone structures are dank, and many books had to be re-bound every 30 years. Once Gore Hall filled up, more libraries were built; some 25-30 of Harvard’s libraries derive from the search for fireproof buildings, which helps explain how widely libraries are distributed across campus. They also developed more than 40 different classification systems. At the beginning of the 20th century, Harvard’s collection was just over one million volumes. Now it adds up to around 18M. [David's presentation was not choppy, the way this paraphrase is.]

In the 1980s, there was continuing debate about what to do about the need for space. The big issue was open or closed stacks. The faculty wanted the books on site so they could be browsed. But stack space is expensive, and you tend to outgrow it faster than you think. So it was decided not to build any more stack space. There already was an offsite repository (the New England Book Depository), but it was decided to build a high-density storage facility to move the non-active parts of the collection to a cheaper, off-site space: the Harvard Depository (HD).

Now more than 40% of the physical collections are at HD. The Faculty of Arts and Sciences started out hostile to the idea, but “soon became converted.” The notion faculty had of browsing the shelves was based on a fantasy: Harvard had never had all the books on a subject on a shelf in a single facility. E.g., search on “Shakespeare” in the Harvard library system: 18,000 hits. Widener Library is where you’d expect to find Shakespeare books. But 8,000 of the volumes aren’t in Widener. Of Widener’s 10K Shakespeare volumes, 4,500 are in HD. So, 25% of what you meant to browse is there. “Shelf browsing is a waste of time” if you’re trying to do thorough research. It’s a little better in the smaller libraries, but the future is not in shelf browsing. Open versus closed stacks isn’t the question any more. “It’s just not possible any longer to do shelf browsing, unless we develop tools for browsing in a non-physical fashion.” E.g., catalog browsers, and ShelfLife (with StackView).

There’s nobody in the stacks any more. “It’s like the zombies have come and cleared people out.” People have new alternatives, and new habits. “But we have real challenges making sure they do as thorough research as possible, and that we leverage our collection.” About 12M of the 18M items are barcoded.

A task force saw that within 40 years, over 70% of the physical collection will be off site. HD was not designed to hold the part of the collection most people want to use. So, what can we do that will give us pedagogical and intellectual benefit, and that realizes the incredible resource our collection is?

Let me present one idea, says David. The Library Task Force said emphatically that Harvard’s collection should be seen as one collection. It makes sense intellectually and financially. But that idea is in contention with the 56 physical libraries at Harvard. Also, most of our collection doesn’t circulate. Only some of it is digitally browsable, and some of that won’t change for a long long long time. E.g., our Arabic journals in Widener aren’t indexed, don’t publish cumulative indexes, and are very hard to index. Thus scholars need to be able to pull them off the shelves. Likewise for big collections of manuscripts that haven’t even been sorted yet.

One idea would be to say: Let’s treat the physical libraries as one place as well. Think of them as contiguous, even though they’re not. What if bar-coded books stayed in the library you returned them to? Not shelved by a taxonomy: you’d get random access via the digital record, which tells you where the work is. And build perfect shelves for the works that need to be physically organized. Let’s build perfect Shakespeare shelves. Put them in one building. The other, less-used works would be findable, but not browsable. This would require investing in better findability systems, but it would let us get past the arbitrariness of classification systems. Already David will usually go to Amazon to decide if he wants a book rather than take the 5 minutes to walk to the library. By focusing on perfect shelves for what is most important to be browsable, resources would be freed up. This might make more space in the physical libraries, so “we could think about what the people in those buildings want to be doing,” so people would come in because there’s more going on. (David notes that this model will not go over well with many of his colleagues.)
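A toy sketch of how the digital side of that might work (the library names, shelf labels, and barcodes are made up for illustration): instead of a call number determining a fixed shelf location, the system simply records wherever each barcode was last checked in.

    # Toy illustration of the "shelve it where it lands" idea: each barcode maps to
    # wherever the item was last checked in, rather than to a fixed, classified spot.
    # Library names, shelf labels, and barcodes here are made up.
    current_location = {}  # barcode -> (library, shelf)

    def check_in(barcode, library, shelf):
        """Record that a returned item now sits on this shelf in this library."""
        current_location[barcode] = (library, shelf)

    def locate(barcode):
        """Random access via the digital record: where is this item right now?"""
        return current_location.get(barcode, "not on campus; request from the depository")

    check_in("32044000000001", "Lamont", "D-12")
    print(locate("32044000000001"))   # ('Lamont', 'D-12')
    print(locate("32044999999999"))   # not on campus; request from the depository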

53% of library space at Harvard is stack space. The other 47% is split between patron space and staff space. About 20-25% is staff space. Harvard has proportionally less patron space than is typical. The HD holds half the collection in 20% of the space. It’s 4x as expensive to store a work in the on-campus stacks as off-site.

David responds to a question: The perfect shelves should be dynamic, not permanent. That will better serve the evolution of research. Classification and shelf location are independent variables. We certainly need classification, but it may not need to map to shelf locations. Widener has bibliographic lists and shelf lists. Barcodes give us more freedom; we don’t have to constantly return works to fixed locations.

Mike Barker: Students already build their own perfect shelves with carrels.

Q: What’s the case for ownership and retention if we’re only addressing temporal faculty needs?

A lot of the collecting in the first half of the 20th century was driven by faculty requests. Not now. The question of retention and purchase splits on the basis of how uncommon the piece of info is. If it’s being sold by Amazon, I don’t think it really matters if we retain it, because of the number of copies and the archival steps already in place. The rarer the work, the more we should think about purchase and retention. But under a third of the stack space on campus has ideal environmental conditions. We shouldn’t put works we buy into those circumstances unless they’re being used.

Q: At the Law Library, we’re trying to spread it out so that not everyone is buying the same stuff. E.g., we buy Peruvian materials because other libraries aren’t. And many law books are not available digitally, so we buy them … but we only buy one copy.

Yes, you’re making an assessment. In the Divinity library, Mike looked at the duplication rate. It was 53%. That is, 53% of our works are duplicated in other Harvard libraries.

Mike: How much do we spend on classification? To create call numbers? We annually spend about $1.5-2M on it, plus another million shelving, so $3M-3.5M total. (Mike warns that this is a “very squishy” number.) We circulate about 700,000 items a year. The total operating budget of the Library is about $152M. (He derived this number by asking catalogers how long it takes to classify an item without one, and dividing that into salary.)
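The shape of that back-of-the-envelope estimate, sketched out below. The inputs are invented placeholders, not Harvard’s figures, which as Mike says are very squishy.

    # Back-of-the-envelope estimate of annual classification cost:
    # items handled per year * minutes per item * salary cost per minute.
    # The example inputs are invented placeholders, not Harvard's actual figures.
    def classification_cost(items_per_year, minutes_per_item, salary_per_minute):
        return items_per_year * minutes_per_item * salary_per_minute

    # With made-up inputs, just to show how squishy assumptions drive the total:
    print(classification_cost(items_per_year=250_000,
                              minutes_per_item=12,
                              salary_per_minute=0.60))  # 1800000.0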

David: Scanning in tables of contents, indexes, etc., lets people find things without having to anticipate what they’re going to be interested in.

Q: Where does serendipity fall in this? What about when you don’t know what you’re looking for?

David: I agree completely. My dissertation depended on a book that no one had checked out since 1910. I found it in the stacks. But it’s not on the shelves now. Suppose I could ask a research librarian to bring me two shelves’ worth of stuff because I’m beginning to explore some area.

Q: What you’re suggesting won’t work so well for students. How would not having stacks affect students?

David: I’m being provocative but concrete. The status quo is not delivering what we think it does, and it hasn’t for the past three decades.

Q: [jeff goldenson] Public librarians tell us that the trucks of recently returned books are the most interesting place to go. We don’t really have the ability to see what’s moving in the Harvard system. Yes, there are privacy concerns, but just showing what books have been returned would be great.

Q: [palfrey] How much does the rise of the digital affect this idea? Also, you’ve said that the storage cost of a digital object may be more than that of physical objects. How does that affect this idea?

David: Copyright law is the big If. It’s not going away. But what kind of access do you have to digital objects that you own? That’s a huge variable. I’ve premised much of what I’ve said on the working notion that we will continue to build physical collections. We don’t know how much it will cost to keep a physical object for a long time. And computer scientists all say that digital objects are not durable. My working notion here is that the parts that are really crucial are the metadata pieces, which are more easily re-buildable if you have the physical objects. We’re not going to buy physical objects for all the digital items, so the selection principle goes back to how grey or black the items are. It depends on whether we get past the engineering question about digital durability — which depends a lot on electromagnetism as a storage medium, which may be a flash in the pan. We’re moving incrementally.

Q: [me] If we can identify the high value works that go on perfect shelves, why not just skip the physical shelves and increase the amount of metadata so that people can browse them looking for the sort of info they get from going to the physical shelf?

A: David: Money. We can’t spend too much on the present at the expense of the next century or two. There’s a threshold where you’d say that it’s worth digitizing them to the degree you’d need to replace physical inspection entirely. It’s a considered judgment, which we make, for example, when we decide to digitize exhibitions. You’d want to look at the opportunity costs.

David suggests that maybe the Divinity library (he’s in the Phil Dept.) should remove some stacks to make space for in-stack work and discussion areas. (He stresses that he’s just thinking out loud.)

Matthew Sheehy, who runs HD, says they’re thinking about how to keep books for 500 years. They spend $300K/year on electricity to create the right environment. They’ve invested in redundancy. But the walls of the HD will only last 100 years. [Nov. 25: I may have gotten the following wrong:] He thinks it costs about $1/year to store a book, not the usual figure of $0.45.

Jeffrey Schnapp: We’re building a library test kitchen. We’re interested in building physical shelves that have digital lives as well.

[Nov. 25: Changed Philosophy school to Divinity, in order to make it correct. Switched the remark about the cost of physical vs. digital in the interest of truth.]

[avignon] [2b2k] Robert Darnton on the history of copyright, open access, the DPLA…

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

We begin with a report on a Ministerial meeting yesterday here on culture — a dialogue among the stakeholders on the Internet. [No users included, I believe.] All agreed on the principles proposed at Deauville: It is a multi-stakeholder ecosystem that complies with law. In this morning’s discussion, I was struck by the convergence: we all agree about remunerating copyright holders. [Selection effect. I favor copyright and remunerating rights holders, but not as the supreme or exclusive value.] We agree that there are more legal alternatives. We agree that the law needs to be enforced. No one argued with that. [At what cost?] And we all agree we need international cooperation, especially to fight piracy.

Now Robert Darnton, Harvard Librarian, gives an invited talk about the history of copyright.

Darnton: I am grateful to be here. And especially grateful you did not ask me to talk about the death of the book. The book is not dead. More books are being produced in print and online every year than in the previous year. This year, more than 1 million new books will be produced. China has doubled its production of books in the past ten years. Brazil has a booming book industry. Even old countries like the US find book production is increasing. We should not bemoan the death of the book.

Should we conclude that all is well in the world of books? Certainly not. Listen to the lamentations of authors, publishers, booksellers. They are clearly frightened and confused. The ground is shifting beneath their feet and they don’t know where to stake a claim. The pace of tech is terrifying. What took millennia, then centuries, then decades, now happens all the time. Homesteading in the new info ecology is made difficult by uncertainty about copyright and economics.

Throughout early modern Europe, publishing was dominated by guilds of booksellers and printers. Modern copyright did not exist, but booksellers accumulated privileges, which Condorcet objected to. These privileges (AKA patents) gave them the exclusive rights to reproduce texts, with the support of the state. The monarchy in the 17th century eliminated competitors, especially ones in the provinces, reinforcing the guild and thus gaining control of publishing. But illegal production thrived. Avignon was a great center of piracy in the 18th century because it was not French. It was surrounded by police intercepting the illegal books. It took a revolution to break the hegemony of the Parisian guild. For two years after the Bastille, the French press enjoyed liberty. Condorcet and others had argued for the abolition of constraints on the free exchange of ideas. It was a utopian vision that didn’t last long.

Modern copyright began with the 1793 French copyright law that established a new model in Europe. The exclusive right to sell a text was limited to the author for lifetime + 10 years. Meanwhile, the British Statute of Anne in 1710 created copyright. Background: The stationers’ monopoly required booksellers — and all had to be members — to register. The oligarchs of the guild crushed their competitors through monopolies. They were so powerful that they provoked resistance even within the book trade. Parliament rejected the guild’s attempt to renew the Licensing Act in 1695. The British celebrate this as the beginning of the end of pre-publication censorship.

The booksellers lobbied for the modern concept of copyright. For new works: 14 years, renewable once. At its origin, copyright law tried to strike a balance between the public good and the private benefit of the copyright owner. According to a liberal view, Parliament got the balance right. But the publishers refused to comply, invoking a general principle inherent in common law: When an author creates work, he acquires an unlimited right to profit from his labor. If he sold it, the publisher owned it in perpetuity. This was Diderot’s position. The same argument occurred in France and England.

In England, the argument culminated in 1774 in Donaldson vs. Beckett, which reaffirmed 14 years renewable once. Then we Americans followed in our Constitution and in the first copyright law in 1790 (“An act for the encouragement of learning,” echoing the British 1710 Act): 14 years renewable once.

The debate is still alive. The 1998 copyright term extension act in the US was considerably shaped by Jack Valenti and the Hollywood lobby. It extended copyright to life + 70 (or, for corporate works, 95 years from publication). We are thus keeping most literature out of the public domain, under a copyright that seems perpetual. Valenti was asked if he favored perpetual copyright and said “No. Copyright should last forever minus one day.”

This history is meant to emphasize the interplay of two elements that run right through the copyright debate: a principle directed toward the public good vs. self-interest for private gain. It would be wrong-headed and naive to assert only the former. But to assert only the latter would be cynical. So, do we have the balance right today?

Consider knowledge and power. We all agree that patents help, but no one would want the knowledge of DNA to be exploited as private property. The privatization of knowledge has become an enclosure movement. Consider academic periodicals. Most knowledge first appears in digitized periodicals. The journal article is the principal outlet for the sciences, law, philosophy, etc. Journal publishers therefore control access to most of the knowledge being created, and they charge a fortune. The price of academic journals rose ten times faster than the rate of inflation in the 1990s. The Journal of Comparative Neurology is $29,113/year. The Brain costs $23,000. The average list price in chemistry is over $3,000. Most of the research was subsidized by taxpayers. It belongs in the public domain. But commercial publishers have fenced off parts of that domain and exploited it. Their profit margins run as high as 40%. Why aren’t they constrained by the laws of supply and demand? Because they have crowded competitors out, and the demand is not elastic: research libraries cannot cancel their subscriptions without an uproar from the faculty. Of course, professors and students produced the research and provided it for free to the publishers. Academics are therefore complicit. They advance their prestige by publishing in journals, but they fail to understand the damage they’re doing to the Republic of Letters.

How to reverse this trend? Open access journals. Journals that are subsidized at the production end and are made free to consumers. They get more readers, too, which is not surprising since search engines index them and it’s easy for readers to get to them. Open Access is easy access, and the ease has economic consequences. Doctors, journalists, researchers, housewives, nearly everyone wants information fast and costless. Open Access is the answer. It is a little simple, but it’s the direction we have to take to address this problem at least in academic journals.

But the Forum is thinking about other things. I admire Google for its technical prowess, but also because it demonstrated that free access to info can be profitable. But it ran into problems when it began to digitize books and make them available. It got sued for alleged breach of copyright. It tried to settle by turning the project into a gigantic business and sharing the profits with the authors and publishers who had sued it. Libraries had provided the books. Now they’d have to buy them back at a price set by Google. Google was fencing off access to knowledge. A federal judge rejected the settlement because, among other points, it threatened to create a monopoly. By controlling access to books, Google occupied a position similar to that of the guilds in London and Paris.

So why not create a library as great as anything imagined by Google, but one that would make works available to users free of charge? Harvard held a workshop on Oct. 1, 2010 to explore this. Like Condorcet’s, a utopian fantasy? But it turns out to be eminently reasonable. A steering committee, a secretariat, and 6 workgroups were established. A year later we launched the Digital Public Library of America at a conference hosted by the major cultural institutions in DC, and in April 2013 we’ll have a preliminary version of it.

Let me emphasize two points. 1. The DPLA will serve a wide and varied constituency throughout the US. It will be a force in education, and will provide a stimulus to the economy by putting knowledge to work. 2. It will spread to everyone on the globe. The DPLA’s technical infrastructure is being designed to be interoperable with Europeana, which is aggregating the digital collections of 27 countries. National digital libraries are sprouting up everywhere, even Mongolia. We need to bring them together. Books have never respected boundaries. Within a few decades, we’ll have worldwide access to all the books in the world, and images, recordings, films, etc.

Of course a lot remains to be done. But, the book is dead? Long live the book!

Q: It is patronizing to think that the USA and Europe will set the policy here. India and China will set this policy.

A: We need international collaboration. And we need an infrastructure that is interoperable.

[2b2k] Interview with Kevin Kelly on What Libraries Want

Dan Jones just posted my Library Lab Podcast conversation with Kevin Kelly, of whom I’m a great admirer.

[2b2k] Will digital scholarship ever keep up?

Scott F. Johnson has posted a dystopic provocation about the present of digital scholarship and possibly about its future.

Here’s the crux of his argument:

… as the deluge of information increases at a very fast pace — including both the digitization of scholarly materials unavailable in digital form previously and the new production of journals and books in digital form — and as the tools that scholars use to sift, sort, and search this material are increasingly unable to keep up — either by being limited in terms of the sheer amount of data they can deal with, or in terms of becoming so complex in terms of usability that the average scholar can’t use it — then the less likely it will be that a scholar can adequately cover the research material and write a convincing scholarly narrative today.

Thus, I would argue that in the future, when the computational tools (whatever they may be) eventually develop to a point of dealing profitably with the new deluge of digital scholarship, the backward-looking view of scholarship in our current transitional period may be generally disparaging. It may be so disparaging, in fact, that the scholarship of our generation will be seen as not trustworthy, or inherently compromised in some way by comparison with what came before (pre-digital) and what will come after (sophisticatedly digital).

Scott tentatively concludes:

For the moment one solution is to read less, but better. This may seem a luddite approach to the problem, but what other choice is there?

First, I should point out that the rest of Scott’s post makes it clear that he’s no Luddite. He understands the advantages of digital scholarship. But I look at this a little differently.

I agree with most of Scott’s description of the current state of digital scholarship and with the inevitability of an ever-increasing deluge of scholarly digital material. But I think the issue is not that the filters won’t be able to keep up with the deluge. Rather, I think we’re just going to have to give up on the idea of “keeping up” — much as newspapers and half-hour news broadcasts have had to give up the pretense that they are covering all the day’s events. The idea of coverage was always an internalization of the limitations of the old media, as if a newspaper, a broadcast, or even the lifetime of a scholar could embrace everything important there is to know about a field. Now the Net has made clear to us what we knew all along: most of what knowledge wanted to do was a mere dream.

So, for me the question is what scholarship and expertise look like when they cannot attain a sense of mastery by artificially limiting the material with which they have to deal. It was much easier when you only had to read at the pace of the publishers. Now you’d have to read at the pace of the writers … and there are so many more writers! So, lacking a canon, how can there be experts? How can you be a scholar?

I’m bad at predicting the future, and I don’t know if Scott is right that we will eventually develop such powerful search and filtering tools that the current generation of scholars will look like betwixt-and-between fools (or like an “asterisk,” as Scott says). There’s an argument that even if the pace of growth slows, the pace of complexification will increase. In any case, I’d guess that deep scholars will continue to exist because that’s more a personality trait than a function of the available materials. For example, I’m currently reading Armies of Heaven, by Jay Rubenstein. The depth of his knowledge about the First Crusade is astounding. Astounding. As more of the works he consulted come online, other scholars of similar temperament will find it easier to pursue their deep scholarship. They will read less and better not as a tactic but because that’s how the world beckons to them. But the Net will also support scholars who want to read faster and do more connecting. Finally (and to me most interestingly), the Net is already helping us address the scaling problem by facilitating the move of knowledge from books to networks. Books don’t scale. Networks do. Although, yes, that fundamentally changes the nature of knowledge and scholarship.

[Note: My initial post embedded one draft inside another and was a total mess. Ack. I've cleaned it up - Oct. 26, 2011, 4:03pm edt.]

[2b2k] Bookbinding and the Digital Bible

Avi Solomon at BoingBoing has a terrific interview with Michael Greer about the appeal of bookbinding, and about Michael’s “Digital Bible.”

I love the photo:

[Photo: Digital Bible, a book with ones and zeroes as its text]
