Archive for category libraries

[liveblog] The future of libraries

I’m at a Hubweek event called “Libraries: The Next Generation.” It’s a panel hosted by the Berkman Center with Dan Cohen, the executive director of the DPLA; Andromeda Yelton, a developer who has done work with libraries; and Jeffrey Schnapp of metaLab.

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

Sue Kriegsman of the Center introduces the session by explaining Berkman’s interest in libraries. “We have libraries lurking in every corner…which is fabulous.” Also, Berkman incubated the DPLA. And it has other projects underway.

Dan Cohen speaks first. He says if he were to give a State of the Union Address about libraries, he’d say: “They are as beloved as ever and stand at the center of communities” here and around the world. He cites a recent Pew survey about perspectives on libraries: libraries have the highest approval rating of all American institutions. But, he warns, that’s fragile. There are many pressures, and libraries are chronically under-funded, which is hard to understand given how beloved they are.

First among the pressures on libraries: the move from print. E-book adoption hasn’t stalled, although the purchase of e-books from the Big Five publishers compared to print has slowed. But Overdrive is lending lots of ebooks. Amazon has 65% of the ebook market, “a scary number,” Dan says. In the Pew survey a couple of weeks ago, 35% said that libraries ought to spend more on ebooks even at the expense of physical books. But 20% thought the opposite. That makes it hard to be the director of a public library.

If you look at the ebook market, there’s more reading going on at places like the DPLA. (He mentions the StackLife browser they use, which came out of the Harvard Library Innovation Lab that I used to co-direct.) Many of the ebooks are being provided straight to a platform (mainly Amazon) by the authors.

There are lots of jobs public libraries do that are unrelated to books. E.g., the Boston Public Library is heavily used by the homeless population.

The way forward? Dan stresses working together, collaboration. “DPLA is as much a social, collaborative project as it is a technical project.” It is run by a community that has gotten together to run a common platform.

And digital is important. We don’t want to leave it to Jeff Bezos who “wants to drop anything on you that you want, by drone, in an hour.”

Andromeda: She says she’s going to talk about “libraries beyond Thunderdome,” echoing a phrase from Sue Kriegsman’s opening comments. “My real concern is with the skills of the people surrounding our crashed Boeing.” Libraries need better skills to evaluate and build the software they need. She gives some examples of places where we see tensions between library values and code.

1. The tension between access and privacy. Physical books leave no traces. With ebooks the reading is generally tracked. Overdrive did a deal so that library patrons who access ebooks get notices from Amazon when their loan period is almost up. Adobe does rights management, with reports coming page by page about what people are reading. “Unencrypted over the Internet,” she adds. “You need a fair bit of technical knowledge to see that this is happening,” she says. “It doesn’t have to be this way.” “It’s the DRM and the technology that have these privacy issues built in.”

She points to the NYPL Library Simplified program that makes it far easier for non-techie users. It includes access to Project Gutenberg. Libraries have an incentive to build open architectures that support privacy. But they need the funding and the technical resources.

She cites the Library Freedom Project that teaches librarians about anti-surveillance technologies. They let library users browse the Internet through TOR, preventing (or at least greatly inhibiting) tracking. They set up the first library TOR node in New Hampshire. Homeland Security quickly suggested that they stop. But there was picketing against this, and the library turned it back on. “That makes me happy.”

2. Metadata. She has us do an image search for “beautiful woman” at Google. They’re basically all white. Metadata is sometimes political. She goes through the 200s of the Dewey Decimal system: 90% Christian. “This isn’t representative of human knowledge. It’s representative of what Melvil Dewey thought maps to human knowledge.” Libraries make certain viewpoints more computationally accessible than others. “Our ability to write new apps is only as good as the metadata under them.” “As we go on to a more computational library world — which is awesome — we’re going to fossilize all these old prejudices. That’s my fear.”

“My hope is that we’ll have the support, conviction and empathy to write software, and to demand software, that makes our libraries better, and more fair.”

Jeffrey: He says his peculiar interest is in how we use space to build libraries as architectures of knowledge. “Libraries are one of our most ancient institutions.” “Libraries have constantly undergone change,” from mausoleums, to cloisters, to warehouses, places of curatorial practice, and civic spaces. “The legacy of that history…has traces of all of those historical identities.” We’ve always faced the question “What is a library?” What are its services? How does it serve its customers? Architects and designers have responded to this, assuming a set of social needs, opportunities, fantasies, and the practices by which knowledge is created, refined, shared. “These are all abiding questions.”

Contemporary architects and designers are often excited by library projects because it crystallizes one of the most central questions of the day: “How do you weave together information and space?” We’re often not very good at that. The default for libraries has been: build a black box.

We have tended to associate libraries with collections. “If you ask what is a library?, the first answer you get is: a collection.” But libraries have also always been about the making of connections, i.e., how the collections are brought alive. E.g., the Alexandrian Library was a performance space. “What does this connection space look like today?” In his book with Matthew Battles, they argue that while we’ve thought of libraries as being a single institution, in fact today there are now many different types of libraries. E.g., the research library as an information space seems to be collapsing; the researchers don’t need reading rooms, etc. But civic libraries are expanding their physical practices.

We need to be talking about many different types of libraries, each with their own services and needs. The Library as an institution is on the wane. We need to proliferate and multiply the libraries to serve their communities and to take advantage of the new tools and services. “We need spaces for learning,” but the stack is just one model.


Dan: Mike O’Malley says that our image of reading is in a salon with a glass of port, but in grad school we’re taught to read a book the way a sous chef guts a fish. A study says that of academic ebooks, 75% of scholars read less than 50 pages of them. [I may have gotten that slightly wrong. Sorry.] Assuming a proliferation of forms, what can we do to address them?

Jeffrey: The presuppositions about how we package knowledge are all up for grabs now. There’s a vast proliferation of channels. “And that’s a design opportunity.” How can we create audiences that would never have been part of the traditional distribution models? “I’m really excited about getting scholars and creative practitioners involved in short-form knowledge and the spectrum of ways you can intersect” the different ways we use these different forms. “That includes print.” There’s “an extraordinary explosion of innovation around print.”

Andromeda: “Reading is a shorthand. Library is really about transforming people and one another by providing access to information.” Reading is not the only way of doing this. E.g., in maker spaces people learn by using their hands. “How can you support reading as a mode of knowledge construction?” Ten years ago she toured Olin College library, which was just starting. The library had chairs and whiteboards on castors. “This is how engineers think”: they want to be able to configure a space on the fly, and have toys for fidgeting. E.g., her eight-year-old has to be standing and moving if she’s asked a hard question. “We need to think of reading as something broader than dealing with a text in front of you.”

Jeffrey: The DPLA has a location in the name — America. The French National Library wants to collect “the French Internet.” But what does that mean? The Net seems to be beyond locality. What role does place play?

Dan: From the beginning we’ve partnered with Europeana. We reused Europeana’s metadata standard, enabling us to share items. E.g., Europeana’s 100th anniversary of the Great War web site was able to seamlessly pull in content from the DPLA via our API, and from other countries. “The DPLA has materials in over 400 languages,” and actively partners with other international libraries.

Dan points to Amy Ryan (the DPLA chairperson, who is in the audience) and cites the construction of glass walls that let you see into the Boston Public Library. This increases “permeability.” When she was head of the BPL, she lowered the stacks on the second floor so now you can see across the entire floor. Permeability “is a very smart architecture” for both physical and digital spaces.

Jeff: Rendering visible a lot of the invisible stuff that libraries do is “super-rich,” assuming the privacy concerns are addressed.

Andromeda: Is there scope in the DPLA metadata for users to address the inevitable imbalances in the metadata?

Dan: We collect data from 1,600 different sources. We normalize the data, which is essential if you want to enable it for collaboration. Our Metadata Application Profile v. 4 adds a field for annotation. Because we’re only a dozen people, we haven’t created a crowd-sourcing tool, but all our data is CC0 (public domain) so anyone who wants to can create a tool for metadata enhancement. If people do enhance it, though, we’ll have to figure out if we import that data into the DPLA.
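[Blogger’s aside: because the bulk metadata is CC0, an outside developer really can experiment with it directly. Here’s a minimal sketch, assuming the public DPLA API still works roughly as documented — the endpoint, parameters, and field names are my recollection and may have changed — of pulling a few records to play with:]

```python
# Minimal sketch, not DPLA's own tooling: fetch a few item records from the
# public DPLA API to experiment with metadata enhancement. The endpoint and
# field names are assumptions based on the documented v2 API and may have changed.
import requests

API_KEY = "YOUR_DPLA_API_KEY"  # hypothetical placeholder; request a real key from DPLA

def fetch_items(query, rows=5):
    resp = requests.get(
        "https://api.dp.la/v2/items",
        params={"q": query, "page_size": rows, "api_key": API_KEY},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json().get("docs", [])

for item in fetch_items("Great War posters"):
    source = item.get("sourceResource", {})
    print(source.get("title"))
```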

Jeffrey: The politics of metadata and taxonomy has a long history. The Enlightenment fantasy is of a universal metadata schema. What does the future look like on this issue?

Andromeda: “You can have extremely crowdsourced metadata, but then you’re subject to astroturfing” and popularity boosting results for bad reasons. There isn’t a great solution except insofar as you provide frameworks for data that enable many points of view and actively solicit people to express themselves. But I don’t have a solution.

Dan: E.g., at DPLA there are lots of ways of entering dates. We don’t want to force a scheme down anyone’s throat. But the tension between crowdsourced and more professional curation is real. The Indianapolis Museum of Art allowed freeform tagging and compared the crowdsourced tags with the professional ones. Crowdsourced: “sea” and “orange” were big, terms that curators generally don’t use.


Q: People structure knowledge differently. My son has ADHD. Or Nepal, where I visited recently.

A: Dan: It’s great that the digital can be reformatted for devices but also for other cultural views. “That’s one of the miraculous things about the digital.” E.g., digital book shelves like StackLife can reorder themselves depending on the query.

Jeff: Yes, these differences can be profound. “Designing for that is a challenge but really exciting.”

Andromeda: This is why it’s so important to talk with lots of people and to enable them to collaborate.

me: Linked data seems to resolve some of these problems with metadata.

Dan: Linked Data provides a common reference for entities. Allows harmonizing data. The DPLA has a slot for such IDs (which are URIs). We’re getting there, but it’s not our immediate priority. [Blogger’s prerogative: Having many references for an item linked via “sameAs” relationships can help get past the prejudice that can manifest itself when there’s a single canonical reference link. But mainly I mean that because Linked Data doesn’t have a single record for each item, new relationships can be added relatively easily.]
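[Another aside, to make the sameAs point concrete. This is purely my own illustration, not anything the DPLA runs; the URIs are invented, and the rdflib library is just a convenient way to write the triples:]

```python
# Illustrative sketch: expressing "these identifiers all refer to the same work"
# as Linked Data triples with rdflib. Every URI below is an invented example.
from rdflib import Graph, URIRef, Literal, Namespace
from rdflib.namespace import OWL, RDFS

g = Graph()

dpla_item = URIRef("https://dp.la/item/example123")           # hypothetical DPLA item URI
viaf_id   = URIRef("http://viaf.org/viaf/000000000")          # hypothetical VIAF identifier
local_rec = URIRef("https://library.example.edu/record/42")   # hypothetical local catalog record

# Assert that all three identifiers denote the same resource.
g.add((dpla_item, OWL.sameAs, viaf_id))
g.add((dpla_item, OWL.sameAs, local_rec))
g.add((dpla_item, RDFS.label, Literal("An example work")))

# Because the graph is just a set of triples, an unanticipated relationship can
# be added later without changing any record schema.
SCHEMA = Namespace("http://schema.org/")
g.add((dpla_item, SCHEMA.about, Literal("libraries")))

print(g.serialize(format="turtle"))
```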

Q: How do business and industry influence libraries? E.g., Google has images for every place in the world. They have scanned books. “I can see a triangulation happening. Virtual libraries? Virtual spaces?”

Andromeda: (1) Virtual tech is written outside of libraries, almost entirely. So it depends on what libraries are able to demand and influence. (2) Commercial tech sets expectations for what user experiences should be like, which libraries may not be able to support. (3) People say, “Why do we need libraries? It’s all online and I can pay for it.” No, it’s not, and no, not everyone can. Libraries should up their tech game, but there’s an existential threat.

Jeffrey: People use other spaces to connect to knowledge, e.g. coffee houses, which are now being incorporated into libraries. Some people are anxious about that loss of boundary. Being able to eat, drink, and talk is a strong “vision statement” but for some it breaks down the world of contemplative knowledge they want from a library.

Q: The National Science and Technology Library in China last week said they have the right to preserve all electronic resources. How can we do that?

Dan: Libraries have long been sites for preservation. In the 21st century we’re so focused on getting access now now now that we lose sight that we may be buying into commercial systems that may not be able to preserve this. This is the main problem with DRM. Libraries are in the forever business, but we don’t know where Amazon will be. We don’t know if we’ll be able to read today’s books on tomorrow’s devices. E.g., “I had a subscription to the Oyster ebook service, but they just went out of business. There go all my books.” Open Access advocacy is going to play a critical role. Sure, Google is a $300B business and they’ll stick around, but they drop services. They don’t have a commitment like libraries and nonprofits and universities do to being in the forever business.

Jeff: It’s a huge question. It’s really important to remember that the oldest digital documents we have are 50 yrs old, which isn’t even a drop in the bucket. There’s far from universal agreement about the preservation formats. Old web sites, old projects, chunks of knowledge of mine have disappeared. What does it mean to preserve a virtual world? We need open standards and practices [missed the word]. “Digital stuff is inherently fragile.”

Andromeda: There are some good things going on in this space. The Rapid Response Social Media project is archiving (e.g., #Ferguson). Preserving software is hard: you need the software system, the hardware, etc.

Q: Disintermediation has stripped out too much value. What are your thoughts on the future of curation?

Jeffrey: There’s a high level of anxiety in the librarian community about their future roles. But I think their role comes out of this reinforced. It requires new skills, though.

Andromeda: In one pottery class the assignment was to make one pot. In another, it was to make 50 pots. The best pots came out of the latter. When lots of people can author lots of stuff, it’s great. That makes curation all the more critical.

Dan: The DPLA has a Curation Corps: librarians helping us organize our ebook collection for kids, which we’re about to launch with President Obama. Also: given the growth in authorship, yes, a lot of it is Sexy Vampires, but even setting that aside, we’ll need librarians to sort through it.

Q: How will Digital Rights Management and copyright issues affect ebooks and libraries? How do you negotiate that or reform that?

Dan: It’s hard to accession a lot of things now. For many ebooks there’s no way to extract them from their DRM and they won’t move into the public domain for well over 100 years. To preserve things like that you have to break the law — some scholars have asked the Library of Congress for exemptions to the DMCA to archive films before they decay.

Q: Lightning round: How do you get people and the culture engaged with public libraries?

Andromeda: Ask yourself: Who’s not here?

Jeffrey: Politicians.

Dan: Evangelism



[2b2k][liveblog] Wayne Wiegand: Libraries beyond information

Wayne Wiegand is giving the lunchtime talk at the Library History Seminar XIII at Simmons College. He’s talking about his new book Part of Our Lives: A People’s History of the American Public Library.

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

He introduces himself as a humanist, which brings with it a curiosity about what it means to be a human in the world. He is flawed, born into a flawed culture. He exercises his curiosity in the field of library history. [He’s also the author of the best biography of Melvil Dewey.]

People love libraries, he says, citing the Pew Internet 2013 survey that showed that almost all institutions except libraries and first responders have fallen in public esteem. His new book traces the history of the public library by listening to people who have used them since the middle of the 19th century, a bottom-up perspective. He did much of his research by searching newspaper archives, finding letters to the editor as well as articles. People love their libraries for (1) the info they make accessible, (2) the public space, and (3) the stories they circulate that make sense of their world.

Thomas Edison spent as much time as possible in the library. The Wright Brothers came upon an ornithology book that kindled their interest in flight. Harry S. Truman cited the library as influential. Lily Tomlin, too. Bill Clinton, too, especially loving books about Native Americans. Barack Obama, too. “The first place I wanted to be was a library,” he said when he returned from overseas. He was especially interested in Kenya, the home of his father.

For most of its history, library info science discourse has focused on what was “useful knowledge” in the 19th century, “best books” in the 20th century, or what we now call “information.” Because people don’t have to use libraries (unlike, say, courts), users have greatly influenced the shape of libraries.

“To demonstrate library as place, let me introduce you to Ricky,” he says as he starts a video. She is an adult student who does her homework in the library. When she was broke, it was a warm place where she could apply for jobs. She has difficulty working through her emotions to express how much the library means to her.

Wayne reads a librarian’s account of the very young MLK’s regular attendance at his public library. James Levine learned to play piano there. In 1969 the Gary, Indiana, library held a talent contest; the Jackson brothers didn’t win, but Michael became a local favorite. [Who won???] In another library, a homeless man, Mr. Conrad, came in and set up a chess board. People listened and learned from him.

“To categorize these activities as information gathering fails to appreciate the richness” of what the library means to these people.

Wayne plays another video. Maria is 95 years old. She started using the library when she was 12 or 13, after her family had immigrated from Russia. “That library was everything to me.” Her family could not afford to buy books, “and there were so many other services, it was library library library all the time.” “I have seen many ugly things. You can’t live all the time with the bad.” The library was something beautiful.

Pete Seeger remembered all his life the stories he read in the library.

The young Ronald Reagan read a popular Christian novel, declared himself saved, and had himself baptized. He went to his public library twice a week, mainly reading adventure stories.

Oprah Winfrey’s library taught her that there was a better world and that she could be a part of it.

Sonia Sotomayor buried herself in reading in the public library after her father died when she was nine. Nancy Drew was formative: paying attention, finding clues, reaching logical conclusions.

Wayne plays a video of Danny, a young man who learned about music from CDs in the library, and found a movie that “dropped an emotional anchor down so I didn’t feel like I was floundering” in his sexuality.

Public libraries have always played a role in making stories accessible to everyone. Communities insist that libraries stock a set of stories that the community responds to. Stories stimulate imagination, construct community through shared reading, and make manifest moral weightings.

In his book, Wayne gives story, people, and place equal weight. “Stories and libraries as place has been as important, and for many people, more important than information.” We need to look at how these activities produce human subjectivity as community-based. We lack a research base to comprehend the many ways libraries are used.

The death of libraries has been pronounced too early. In 2012, the US had more libraries than ever. Attendance in 2012 dipped because the hours libraries were open went down that year, but for the decade it was up 28%. [May have gotten the number wrong a bit.] In 2012, libraries circulated 2.2B items, up 28% from 2003. And more. [Too fast to capture.] The prophets of doom have too narrow a view of what libraries do and are. “We have to expand the boundaries of our professional discourse beyond information.”

Libraries fighting against budget cuts too often replicate the stereotypes. “Public libraries no longer are warehouses of books” gives credence to the falsehood that libraries ever were that.

He ends by introducing Dawn Logsdon, who is working on a film for 2017 titled Free for All: Inside the Public Library. (She’s been taping people at the conference and assures the audience that whatever doesn’t make it into the film will be available online.) She shows a few minutes of a prior documentary of hers: Faubourg Treme.



[siu] Accessing content

Alex Hodgson of ReadCube is leading a panel called “Accessing Content: New Thinking and New Business Models for Accessing Research Literature” at the Shaking It Up conference.

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

Robert McGrath is from ReadCube, a platform for managing references. You import your pdfs, read them with their enhanced reader, and can annotate them and discover new content. You can click on references in the PDF and go directly to the sources. If you hit a pay wall, they provide a set of options, including a temporary “checkout” of the article for $6. Academic libraries can set up a fund to pay for such access.

Eric Hellman talks about Unglue.it. Everyone in the book supply chain wants a percentage. But free e-books break the system because there are no percentages to take. “Even libraries hate free ebooks.” So, how do you give access to Oral Literature in Africa in Africa? Unglue.it ran a campaign, raised money, and liberated it. How do you get free textbooks into circulation? Teachers don’t know what’s out there. Unglue.it is creating MARC records for these free books to make it easy for libraries to include them. The novel Zero Sum Game is a great book that the author put out under a Creative Commons license, but how do you find out that it’s available? Likewise for Barbie: A Computer Engineer, which is a legal derivative of a much worse book. Unglue.it has over 1,000 Creative Commons licensed books in its collection. One of Unglue.it’s projects: an author pledges to make the book available for free after a revenue target has been met. [Great! A bit like the Library License project from the Harvard Library Innovation Lab.] They’re now doing Thanks for Ungluing, which aggregates free ebooks and lets you download them for free or pay the author for them. [Plug: John Sundman’s Biodigital is available there. You definitely should pay him for it. It’s worth it.]

Marge Avery, ex of MIT Press and now at MIT Library, says the traditional barriers to access are price, time, and format. There are projects pushing on each of these. But she mainly wants to talk about format. “What does content want to be?” Academic authors often have research that won’t fit in the book. Univ presses are experimenting with shorter formats (MIT Press Bits), new content (Stanford Briefs), and publishing developing, unfinished content that will become a book (U of Minnesota). Cambridge Univ Press published The History Manifesto, which was created start to finish in four months and is available as Open Access as well as for a reasonable price; they’ve sold as many copies as free copies have been downloaded, which is great.

William Gunn of Mendeley talks about next-gen search. “Search doesn’t work.” Paul Kedrosky was looking for a dishwasher and all he found was spam. (Dishwashers, and how Google Eats Its Own Tail). Likewise, Jeff Atwood of StackExchange: “Trouble in the House of Google.” And we have the same problems in scholarly work. E.g., Google Scholar includes this as a scholarly work. Instead, we should be favoring push over pull, as at Mendeley. Use behavior analysis, etc. “There’s a lot of room for improvement” in search. He shows a Mendeley search. It auto-suggests keyword terms and then lets you facet.

Jenn Farthing talks about JSTOR’s “Register and Read” program. JSTOR has 150M content accesses per year, 9,000 institutions, 2,000 archival journals, 27,000 books. Register and Read: Free limited access for everyone. Piloted with 76 journals. Up to 3 free reads over a two week period. Now there are about 1,600 journals, and 2M users who have checked out 3.5M articles. (The journals are opted in to the program by their publishers.)


Q: What have you learned in the course of these projects?

ReadCube: UI counts. Tracking onsite behavior is therefore important. Iterate and track.

Marge: It’d be good to have more metrics outside of sales. The circ of the article is what’s really of importance to the scholar.

Mendeley: Even more attention to the social relationships among the contributors and readers.

JSTOR: You can’t search for only content that’s available to you through Register and Read. We’re adding that.

Unglue.it: We started out as a crowdfunding platform for free books. We didn’t realize how broken the supply chain is. Putting a book on a Web site isn’t enough. If we were doing it again, we’d begin with what we’re doing now, Thanks for Ungluing, gathering all the free books we can find.

Q: How to make it easier for beginners?

Unglue.it: The publishing process is designed to prevent people from doing stuff with ebooks. That’s a big barrier to the adoption of ebooks.

ReadCube: Not every reader needs a reference manager, etc.

Q: Even beginning students need articles to interoperate.

Q: When ReadCube negotiates prices with publishers, how does it go?

ReadCube: In our pilots, we haven’t seen any decline in the PDF sales. Also, the cost per download in a site license is a different sort of thing than a $6/day cost. A site license remains the most cost-effective way of acquiring access, so what we’re doing doesn’t compete with those licenses.

Q: The problem with the pay model is that you can’t appraise the value of the article until you’ve paid. Many pay models don’t recognize that barrier.

ReadCube: All the publishers have agreed to first-page previews, which often lets you see the diagrams. We also show a blurred-out version of the pages that gives you a sense of the structure of the article. It remains a risk, of course.

Q: What’s your advice for large legacy publishers?

ReadCube: There’s a lot of room to explore different ways of brokering access — different potential payers, doing quick pilots, etc.

Mendeley: Make sure your revenue model is in line with your mission, as Geoff said in the opening session.

Marge: Distinguish the content from the container. People will pay for the container for convenience. People will pay for a book in Kindle format, while the content can be left open.

Mendeley: Reading a PDF is of human value, but computing across multiple articles is of emerging value. So we should be getting past the single reader business model.

JSTOR: Single article sales have not gone down because of Register and Read. They’re different users. Traditional publishers should cut their cost basis. They have fancy offices in expensive locations. They need to start thinking about how they can cut the cost of what they do.


[2b2k] Four things to learn in a learning commons

Last night I got to give a talk at a public meeting of the Gloucester Education Foundation and the Gloucester Public School District. We talked about learning commons and libraries. It was awesome to see the way that community comports itself towards its teachers, students and librarians, and how engaged they are. Truly exceptional.

Afterwards there were comments by Richard Safier (superintendent), Deborah Kelsey (director of the Sawyer Free Library), and Samantha Whitney (librarian and teacher at the high school), and then a brief workshop at the attendees tables. The attendees included about a dozen of Samantha’s students; you can see in the liveliness of her students and the great questions they asked that Samantha is an inspiring teacher.

I came out of these conversations thinking that if my charter were to establish a “learning commons” in a school library, I’d ask what sort of learning I want to be modeled in that space. I think I’d be looking for four characteristics:

1. Students need to learn the basics (and beyond!) of online literacy: not just how to use the tools, but, more important, how to think critically in the networked age. Many schools are recognizing that, thankfully. But it’s something that probably will be done socially as often as not: “Can I trust a site?” is a question probably best asked of a network.

2. Old-school critical thinking was often thought of as learning how to sift claims so that only that which is worth believing makes it through. Those skills are of course still valuable, but on a network we are almost always left with contradictory piles of sifted beliefs. Sometimes we need to dispute those other beliefs because they are simply wrong. But on a network we also need to learn to live with difference — and to appreciate difference — more than ever. So, I would take learning to love difference to be an essential skill.

3. It kills me that most people have never clicked on a Wikipedia “Talk” page to see the discussion that resulted in the article they’re reading. If we’re going to get through this thing — life together on this planet — we’re really going to have to learn to be more meta-aware about what we read and encounter online. The old trick of authority was to erase any signs of what produced the authoritative declaration. We can’t afford that any more. We need always to be aware that what we come across resulted from humans and human processes.

4. We can’t rely on individual brains. We need brains that are networked with other brains. Those networks can be smarter than any of their individual members, but only if the participants learn how to let the group make them all smarter instead of stupider.

I am not sure how these skills can be taught — excellent educators and the communities that support them, like those I met last night, are in a better position to figure it out — but they are four skills that seem highly congruent with a networked learning commons.


Library as starting point

A new report on Ithaka S+R‘s annual survey of libraries suggests that library directors are committed to libraries being the starting place for their users’ research, but that the users are not in agreement. This calls into question the expenditures libraries make to achieve that goal. (Hat tip to Carl Straumsheim and Peter Suber.)

The question is good. My own opinion is that libraries should let Google do what it’s good at, while they focus on what they’re good at. And libraries are very good indeed at particular ways of discovery. The goal should be to get the mix right, not to make sure that libraries are the starting point for their communities’ research.

The Ithaka S+R survey found that “The vast majority of the academic library directors…continued to agree strongly with the statement: ‘It is strategically important that my library be seen by its users as the first place they go to discover scholarly content.'” But the survey showed that only about half think that that’s happening. This gap can be taken as room for improvement, or as a sign that the aspiration is wrongheaded.

The survey confirms that many libraries have responded to this by moving to a single-search-box strategy, mimicking Google. You just type in a couple of words about what you’re looking for and it searches across every type of item and every type of system for managing those items: images, archival files, books, maps, museum artifacts, faculty biographies, syllabi, databases, biological specimens… Just like Google. That’s the dream, anyway.

I am not sold on it. Roger Schonfeld, the report’s author, cites Lorcan Dempsey, who is always worth listening to:

Lorcan Dempsey has been outspoken in emphasizing that much of “discovery happens elsewhere” relative to the academic library, and that libraries should assume a more “inside-out” posture in which they attempt to reveal more effectively their distinctive institutional assets.

Yes. There’s no reason to think that libraries are going to be as good at indexing diverse materials as Google et al. are. So, libraries should make it easier for the search engines to do their job. Library platforms can help. So can schema.org markup, as a way of enriching HTML pages about library items so that the search engines can easily recognize the library item metadata.
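Here is a minimal sketch of what that enrichment could look like, assuming schema.org is the vocabulary in question: a small JSON-LD description of a catalog item that a page template could embed for search engines to read. The item data and URLs are invented.

```python
# Illustrative sketch: build a schema.org JSON-LD snippet that a library catalog
# page could embed so search engines can recognize the item metadata.
# The schema.org vocabulary is real; the item data and URLs below are invented.
import json

def book_jsonld(title, author, isbn, url):
    return {
        "@context": "https://schema.org",
        "@type": "Book",
        "name": title,
        "author": {"@type": "Person", "name": author},
        "isbn": isbn,
        "url": url,
    }

record = book_jsonld(
    title="An Example Title",
    author="A. N. Author",
    isbn="0000000000000",                          # placeholder, not a real ISBN
    url="https://catalog.example.edu/record/123",  # hypothetical catalog URL
)

# A page template would drop this into a <script type="application/ld+json"> tag.
print(json.dumps(record, indent=2))
```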

But assuming that libraries shouldn’t outsource all of their users’ searches, then what would best serve their communities? This is especially complicated since the survey reveals that preference for the library web site vs. the open Web varies based on just about everything: institution, discipline, role, experience, and whether you’re exploring something new or keeping up with your field. This leads Roger to provocatively ask:

While academic communities are understood as institutionally affiliated, what would it entail to think about the discovery needs of users throughout their lifecycle? And what would it mean to think about all the different search boxes and user login screens across publishes [sic] and platforms as somehow connected, rather than as now almost entirely fragmented? …Libraries might find that a less institutionally-driven approach to their discovery role would counterintuitively make their contributions more relevant.

I’m not sure I agree, in part because I’m not entirely sure what Roger is suggesting. If it’s that libraries should offer an experience that integrates all the sources scholars consult throughout the lifecycle of their projects or themselves, then, I’d be happy to see experiments, but I’m skeptical. Libraries generally have not shown themselves to be particularly adept at creating grand, innovative online user experiences. And why should they be? It’s a skill rarely exhibited anywhere on the Web.

If designing great Web experiences is not a traditional strength of research libraries, the networked expertise of their communities is. So is the library’s uncompromised commitment to serving its community’s interests. A discovery system that learns from its community can do something that Google cannot: it can find connections that the community has discerned, and it can return results that are particularly relevant to that community. (It can make those connections available to the search engines also.)

This is one of the principles behind the StackLife project that came out of the Harvard Library Innovation Lab that until recently I co-directed. It’s one of the principles of the Harvard LibraryCloud platform that makes StackLife possible. It’s one of the reasons I’ve been touting a technically dumb cross-library measure of usage. These are all straightforward ways to start to record and use information about the items the community has voted for with its library cards.

It is by far just the start. Anonymization and opt-in could provide rich sets of connections and patterns of usage. Imagine we could know what works librarians recommend in response to questions. Imagine if we knew which works were being clustered around which topics in lib guides and syllabi. (Support the Open Syllabus Project!) Imagine if we knew which books were being put on lists by faculty and students. Imagine if we knew what books were on participating faculty members’ shelves. Imagine we could learn which works the community thinks are awesome. Imagine if we could do this across institutions so that communities could learn from one another. Imagine we could do this with data structures that support wildly, messily linked sources, many of them within the library but many of them outside of it. (Support Linked Data!)

Let the Googles and Bings do what they do better than any sane person could have imagined twenty years ago. Let libraries do what they have been doing better than anyone else for centuries: supporting and learning from networked communities of scholars, librarians, and students who together are a profound source of wisdom and working insight.


[2b2k] Digital Humanities: Ready for your 11AM debunking?

The New Republic continues to favor articles debunking claims that the Internet is bringing about profound changes. This time it’s an article on the digital humanities, titled “The Pseudo-Revolution,” by Adam Kirsch, a senior editor there. [This seems to be the article. Tip of the hat to Jose Afonso Furtado.]

I am not an expert in the digital humanities, but it’s clear to the people in the field who I know that the meaning of the term is not yet settled. Indeed, the nature and extent of the discipline is itself a main object of study of those in the discipline. This means the field tends to attract those who think that the rise of the digital is significant enough to warrant differentiating the digital humanities from the pre-digital humanities. The revolutionary tone that bothers Adam so much is a natural if not inevitable consequence of the sociology of how disciplines are established. That of course doesn’t mean he’s wrong to critique it.

But Adam is exercised not just by revolutionary tone but by what he perceives as an attempt to establish claims through the vehemence of one’s assertions. That is indeed something to watch out for. But I think it also betrays a tin-eared reading by Adam. Those assertions are being made in a context the authors I think properly assume readers understand: the digital humanities is not a done deal. The case has to be made for it as a discipline. At this stage, that means making provocative claims, proposing radical reinterpretations, and challenging traditional values. While I agree that this can lead to thoughtless triumphalist assumptions by the digital humanists, it also needs to be understood within its context. Adam calls it “ideological,” and I can see why. But making bold and even over-bold claims is how discourses at this stage proceed. You challenge the incumbents, and then you challenge your cohort to see how far you can go. That’s how the territory is explored. This discourse absolutely needs the incumbents to push back. In fact, the discourse is shaped by the assumption that the environment is adversarial and the beatings will arrive in short order. In this case, though, I think Adam has cherry-picked the most extreme and least plausible provocations in order to argue against the entire field, rather than against its overreaching. We can agree about some of the examples and some of the linguistic extensions, but that doesn’t dismiss the entire effort the way Adam seems to think it does.

It’s good to have Adam’s challenge. Because his is a long and thoughtful article, I’ll discuss the thematic problems with it that I think are the most important.

First, I believe he’s too eager to make his case, which is the same criticism he makes of the digital humanists. For example, when talking about the use of algorithmic tools, he talks at length about Franco Moretti‘s work, focusing on the essay “Style, Inc.: Reflections on 7,000 Titles.” Moretti used a computer to look for patterns in the titles of 7,000 novels published between 1740 and 1850, and discovered that they tended to get much shorter over time. “…Moretti shows that what changed was the function of the title itself.” As the market for novels got more crowded, the typical title went from being a summary of the contents to a “catchy, attention-grabbing advertisement for the book.” In addition, says Adam, Moretti discovered that sensationalistic novels tend to begin with “The” while “pioneering feminist novels” tended to begin with “A.” Moretti tenders an explanation, writing “What the article ‘says’ is that we are encountering all these figures for the first time.”

Adam concludes that while Moretti’s research is “as good a case for the usefulness of digital tools in the humanities as one can find” in any of the books under review, “its findings are not very exciting.” And, he says, you have to know which questions to ask the data, which requires being well-grounded in the humanities.

That you need to be well-grounded in the humanities to make meaningful use of digital tools is an important point. But here he seems to me to be arguing against a straw man. I have not encountered any digital humanists who suggest that we engage with our history and culture only algorithmically. I don’t profess expertise in the state of the digital humanities, so perhaps I’m wrong. But the digital humanists I know personally (including my friend Jeffrey Schnapp, a co-author of a book, Digital_Humanities, that Adam reviews) are in fact quite learned lovers of culture and history. If there is indeed an important branch of digital humanities that says we should entirely replace the study of the humanities with algorithms, then Adam’s criticism is trenchant…but I’d still want to hear from less extreme proponents of the field. In fact, in my limited experience, digital humanists are not trying to make the humanities safe for robots. They’re trying to increase our human engagement with and understanding of the humanities.

As to the point that algorithmic research can only “illustrate a truism rather than discovering a truth” — a criticism he levels even more fiercely at the Ngram research described in the book Uncharted — it seems to me that Adam is missing an important point. If computers can now establish quantitatively the truth of what we have assumed to be true, that is no small thing. For example, the Ngram work has established not only that Jewish sources were dropped from German books during the Nazi era, but also the timing and extent of the erasure. This not only helps make the humanities more evidence-based — remember that Adam criticizes the digital humanists for their argument-by-assertion — but also opens the possibility of algorithmically discovering correlations that overturn assumptions or surprise us. One might argue that we therefore need to explore these new techniques more thoroughly, rather than dismissing them as adding nothing. (Indeed, the NY Times review of Uncharted discusses surprising discoveries made via Ngram research.)

Perhaps the biggest problem I have with Adam’s critique I’ve also had with some digital humanists. Adam thinks of the digital humanities as being about the digitizing of sources. He then dismisses that digitizing as useful but hardly revolutionary: “The translation of books into digital files, accessible on the Internet around the world, can be seen as just another practical tool…which facilitates but does not change the actual humanistic work of thinking and writing.”

First, that underplays the potential significance of making the works of culture and scholarship globally available.

Second, if you’re going to minimize the digitizing of books as merely the translation of ink into pixels, you miss what I think is the most important and transformative aspect of the digital humanities: the networking of knowledge and scholarship. Adam in fact acknowledges the networking of scholarship in a twisty couple of paragraphs. He quotes the following from the book Digital_Humanities:

The myth of the humanities as the terrain of the solitary genius…— a philosophical text, a definitive historical study, a paradigm-shifting work of literary criticism — is, of course, a myth. Genius does exist, but knowledge has always been produced and accessed in ways that are fundamentally distributed…

Adam responds by name-checking some paradigm-shifting works, and snidely adds “you can go to the library and check them out…” He then says that there’s no contradiction between paradigm-shifting works existing and the fact that “Scholarship is always a conversation…” I believe he is here completely agreeing with the passage he thinks he’s criticizing: genius is real; paradigm-shifting works exist; these works are not created by geniuses in isolation.

Then he adds what for me is a telling conclusion: “It’s not immediately clear why things should change just because the book is read on a screen rather than on a page.” Yes, that transposition doesn’t suggest changes any more worthy of research than the introduction of mass market paperbacks in the 1940s [source]. But if scholarship is a conversation, might moving those scholarly conversations themselves onto a global network raise some revolutionary possibilities, since that global network allows every connected person to read the scholarship and its objects, lets everyone comment, provides no natural mechanism for promoting any works or comments over any others, inherently assumes a hyperlinked rather than sequential structure of what’s written, makes it easier to share than to sequester works, is equally useful for non-literary media, makes it easier to transclude than to include so that works no longer have to rely on briefly summarizing the other works they talk about, makes differences and disagreements much more visible and easily navigable, enables multiple and simultaneous ordering of assembled works, makes it easier to include everything than to curate collections, preserves and perpetuates errors, is becoming ubiquitously available to those who can afford connection, turns the Digital Divide into a gradient while simultaneously increasing the damage done by being on the wrong side of that gradient, is reducing the ability of a discipline to patrol its edges, and a whole lot more.

It seems to me reasonable to think that it is worth exploring whether these new affordances, limitations, relationships and metaphors might transform the humanities in some fundamental ways. Digital humanities too often is taken simply as, and sometimes takes itself as, the application of computing tools to the humanities. But it should be (and for many, is) broad enough to encompass the implications of the networking of works, ideas and people.

I understand that Adam and others are trying to preserve the humanities from being abandoned and belittled by those who ought to be defending the traditional in the face of the latest. That is a vitally important role, for as a field struggling to establish itself digital humanities is prone to over-stating its case. (I have been known to do so myself.) But in my understanding, that assumes that digital humanists want to replace all traditional methods of study with computer algorithms. Does anyone?

Adam’s article is a brisk challenge, but in my opinion he argues too hard against his foe. The article becomes ideological, just as he claims the explanations, justifications and explorations offered by the digital humanists are.

More significantly, focusing only on the digitizing of works and ignoring the networking of their ideas and the people discussing those ideas, glosses over the locus of the most important changes occurring within the humanities. Insofar as the digital humanities focus on digitization instead of networking, I intend this as a criticism of that nascent discipline even more than as a criticism of Adam’s article.


[2b2k] In defense of the library Long Tail

Two percent of Harvard’s library collection circulates every year. A high percentage of the works that are checked out are the same as the books that were checked out last year. This fact can cause reflexive tsk-tsking among librarians. But — with some heavy qualifications to come — this is as it should be. The existence of a Long Tail is not a sign of failure or waste. To see this, consider what it would be like if there were no Long Tail.

Harvard’s 73 libraries have 16 million items [source]. There are 21,000 students and 2,400 faculty [source]. If we guess that half of the library items are available for check-out, which seems conservative, that would mean that 160,000 different items are checked out every year. If there were no Long Tail, then no book would be checked out more than any other. In that case, it would take the Harvard community an even fifty years before anyone would have read the same book as anyone else. And a university community in which across two generations no one has read the same book as anyone else is not a university community.
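To make the back-of-the-envelope arithmetic explicit, here is the same calculation spelled out; the inputs are just the figures cited above, not new data.

```python
# Restatement of the post's back-of-the-envelope numbers.
total_items = 16_000_000               # items across Harvard's 73 libraries
checkout_eligible = total_items // 2   # the post's guess: half can be checked out
circulation_rate = 0.02                # two percent circulates every year

distinct_items_per_year = int(checkout_eligible * circulation_rate)      # 160,000
years_before_any_overlap = checkout_eligible // distinct_items_per_year  # 50

print(distinct_items_per_year, years_before_any_overlap)  # 160000 50
```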

I know my assumptions are off. For example, I’m not counting books that are read in the library and not checked out. But my point remains: we want our libraries to have nice long tails. Library long tails are where culture is preserved and discovery occurs.

And, having said that, it is perfectly reasonable to work to lower the difference between the Fat Head and the Long Tail, and it is always desirable to help people to find the treasures in the Long Tail. Which means this post is arguing against a straw man: no one actually wants to get rid of the Long Tail. But I prefer to say that this post argues against a reflex of thought I find within myself and have encountered in others. The Long Tail is a requirement for the development of culture and ideas, and at the same time, we should always help users to bring riches out of the Long Tail.


Linked Data for Libraries: And we’re off!

I’m just out of the first meeting of the three universities participating in a Mellon grant — Cornell, Harvard, and Stanford, with Cornell as the grant instigator and leader — to build, demonstrate, and model using library resources expressed as Linked Data as a tool for researchers, students, teachers, and librarians. (Note that I’m putting all this in my own language, and I was certainly the least knowledgeable person in the room. Don’t get angry at anyone else for my mistakes.)

This first meeting, two days long, was very encouraging indeed: it’s a superb set of people, we are starting out on the same page in terms of values and principles, and we enjoyed working with one another.

The project is named Linked Data for Libraries (LD4L) (minimal home page), although that doesn’t entirely capture it, for the actual beneficiaries of it will not be libraries but scholarly communities taken in their broadest sense. The idea is to help libraries make progress with expressing what they know in Linked Data form so that their communities can find more of it, see more relationships, and contribute more of what the communities learn back into the library. Linked Data is not only good at expressing rich relations, it makes it far easier to update the dataset with relationships that had not been anticipated. This project aims at helping libraries continuously enrich the data they provide, and making it easier for people outside of libraries — including application developers and managers of other Web sites — to connect to that data.

As the grant proposal promised, we will use existing ontologies, adapting them only when necessary. We do expect to be working on an ontology for library usage data of various sorts, an area in which the Harvard Library Innovation Lab has done some work, so that’s very exciting. But overall this is the opposite of an attempt to come up with new ontologies. Thank God. Instead, the focus is on coming up with implementations at all three universities that can serve as learning models, and that demonstrate the value of having interoperable sets of Linked Data across three institutions. We are particularly focused on showing the value of the high-quality resources that libraries provide.

There was a great deal of emphasis in the past two days on partnerships and collaboration. And there was none of the “We’ll show ‘em where they got it wrong, by gum!” attitude that in my experience all too often infects discussions on the pioneering edge of standards. So, I just got to spend two days with brilliant library technologists who are eager to show how a new generation of tech, architecture, and thought can amplify the already immense value of libraries.

There will be more coming about this effort soon. I am obviously not a source for tech info; that will come soon and from elsewhere.


Aaron Swartz and the future of libraries

I was unable to go to our local Aaron Swartz Hackathon, one of twenty around the world, because I’d committed (very happily) to give the after dinner talk at the University of Rhode Island Graduate Library and Information Studies 50th anniversary gala last night.

The event brought together an amazing set of people, including Senator Jack Reed, the current and most recent presidents of the American Library Association, Joan Ress Reeves, 50 particularly distinguished alumni (out of the three thousand (!) who have been graduated), and many, many more. These are heroes of libraries. (My cousin’s daughter, Alison Courchesne, also got an award. Yay, Alison!)

Although I’d worked hard on my talk, I decided to open it differently. I won’t try to reproduce what I actually said because the adrenalin of speaking in front of a crowd, especially one as awesome as last night’s, wipes out whatever short term memory remains. But it went very roughly something like this:

It’s awesome to be in a room with teachers, professors, researchers, a provost, deans, and librarians: people who work to make the world better…not to mention the three thousand alumni who are too busy do-ing to be able to be here tonight.

But it makes me remember another do-er: Aaron Swartz, the champion of open access, open data, open metadata, open government, open everything. Maybe I’m thinking about Aaron tonight because today is his birthday.

When we talk about the future of libraries, I usually promote the idea of libraries as platforms — platforms that make openly available everything that libraries know: all the data, all the metadata, what the community is making of what they get from the library (privacy accommodated, of course), all the guidance and wisdom of librarians, all the content, especially if we can ever fix the insane copyright laws. Everything. All accessible to anyone who wants to write an application that puts it to use.

And the reason for that is because in my heart I don’t think librarians are going to invent the future of libraries. It’s too big a job for any one group. It will take the world to invent the future of libraries. It will take 14 year olds like Aaron to invent the future of libraries. We need to supply them with platforms that enable them.

I should add that I co-direct a Library Innovation Lab where we do work that I’m very proud of. So, of course libraries will participate in the invention of their future. But it’ll take the world — a world that contains people with the brilliance and commitment of an Aaron Swartz — to invent that future fully.


Here are wise words delivered at an Aaron Hackathon last night by Carl Malamud: Hacking Authority. For me, Carl is reminding us that the concept of hacking over-promises when the changes threaten large institutions that represent long-held values and assumptions. Change often requires the persistence and patience that Aaron exhibited, even as he hacked.


[annotation][2b2k] Neel Smith: Scholarly annotation + Homer

Neel Smith of Holy Cross is talking about the Homer Multitext project, a “long term project to represent the transmission of the Iliad in digital form.”

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

He shows the oldest extant ms of the Iliad, which includes 10th century notes. “The medieval scribes create a wonderful hypermedia” work.

“Scholarly annotation starts with citation.” He says we have a good standard: URNs, which can point to, for example, an ISBN number. His project uses URNs to refer to texts in a FRBR-like hierarchy [works at various levels of abstraction]. These are semantically rich and machine-actionable. You can Google a URN and get the object. You can put a URN into a URL for direct Web access. You can embed an image into a Web page via its URN [using a service, I believe].

An annotation is an association. In a scholarly notation, it’s associated with a citable entity. [He shows some great examples of the possibilities of cross linking and associating.]

The metadata is expressed as RDF triples. Within the Homer project, they’re inductively building up a schema of the complete graph [network of connections]. For end users, this means you can see everything associated with a particular URN. Building a facsimile browser, for example, becomes straightforward, mainly requiring the application of XSL and CSS to style it.
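[My own illustration of the idea, not the project’s actual schema: attaching annotations to a citable URN as RDF triples and then pulling back everything associated with that URN. The URN below follows the general CTS pattern for the Iliad as I understand it, and the predicates are invented.]

```python
# Illustrative sketch only (not Homer Multitext's real vocabulary): associate
# annotations with a citable CTS URN as RDF triples, then retrieve everything
# about that URN. The URN's exact form is an assumption; the predicates are invented.
from rdflib import Graph, URIRef, Literal, Namespace

EX = Namespace("http://example.org/hmt/")                   # hypothetical annotation vocabulary
iliad_1_1 = URIRef("urn:cts:greekLit:tlg0012.tlg001:1.1")   # Iliad book 1, line 1 (assumed form)

g = Graph()
g.add((iliad_1_1, EX.hasScholion, Literal("Tenth-century marginal note on the opening word")))
g.add((iliad_1_1, EX.appearsOnFolio, URIRef("http://example.org/hmt/folio/12r")))

# "See everything associated with a particular URN": iterate the triples whose
# subject is that citation.
for predicate, obj in g.predicate_objects(subject=iliad_1_1):
    print(predicate, obj)
```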

Another example: Mise en page: automated layout analysis. This in-progress project analyzes the layout of annotation info on the Homeric pages.