Archive for April, 2011

Big Data Models: Help me crowdsource sources

I’m thrilled that I’m going to be writing an article for Scientific American on big data models — models that cover some huge swath of life, such as the economy, the climate, sociopolitical change, etc. What’s the promise and what are the challenges? How far can such models scale?

So, who do you think I should interview? What projects strike you as particularly illuminating? Let me know in the comments, or at




Esquire’s article (by Tom Junod) about Eric Schadt and non-reductive molecular biology would have been chapter fodder for 2b2k if the book weren’t (I hope) done. Fascinating.

(See the brief but interesting discussion at metafilter.)


How-to guide for moving a journal to Open Access

The Association for Learning Technology has published a detailed and highly practical guide, based on its own experience, for journals moving toward an Open Access model. Indeed, the guide is of even broader utility than that, since it considers the practicalities of moving from an existing contract with publishers for any reason.

ALT’s journal has been renamed Research in Learning Technology, and it will be fully Open Access as of January 2012. (Thanks to Seb Schmoller for the tip.)


[2b2k] How much information

The latest issue of Science (April 1, DOI: 10.1126/science.1200970) has an article (protected by copyright from your prying eyes) by Martin Hilbert and Priscilla Lopez about the increase in information from 1986-2007. Or, to be more exact, here’s the abstract:

We estimated the world’s technological capacity to store, communicate, and compute information, tracking 60 analog and digital technologies during the period from 1986 to 2007. In 2007, humankind was able to store 2.9 × 10^20 optimally compressed bytes, communicate almost 2 × 10^21 bytes, and carry out 6.4 × 10^18 instructions per second on general-purpose computers. General-purpose computing capacity grew at an annual rate of 58%. The world’s capacity for bidirectional telecommunication grew at 28% per year, closely followed by the increase in globally stored information (23%). Humankind’s capacity for unidirectional information diffusion through broadcasting channels has experienced comparatively modest annual growth (6%). Telecommunication has been dominated by digital technologies since 1990 (99.9% in digital format in 2007), and the majority of our technological memory has been in digital format since the early 2000s (94% digital in 2007).

To take care of the redundancy of stored information, they normalized around the optimal compression rates of 2007.
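To get a feel for what those annual rates add up to, here is a rough back-of-the-envelope sketch in Python. It assumes (my assumption, not something the abstract states) that each quoted rate compounds uniformly across the 21 yearly steps from 1986 to 2007; the function and variable names are mine.

```python
# Rough sketch: compound the annual growth rates quoted in the abstract.
# Assumption (not from the article): each rate applies uniformly, year over
# year, for the 21 annual steps between 1986 and 2007.

def growth_factor(annual_rate: float, years: int) -> float:
    """Return the total multiplier after compounding annual_rate for years."""
    return (1.0 + annual_rate) ** years

YEARS = 2007 - 1986  # 21 annual steps

# Annual growth rates as quoted in the abstract.
RATES = {
    "general-purpose computing": 0.58,
    "bidirectional telecommunication": 0.28,
    "stored information": 0.23,
    "broadcasting": 0.06,
}

factors = {name: growth_factor(rate, YEARS) for name, rate in RATES.items()}

for name, factor in factors.items():
    print(f"{name}: roughly {factor:,.0f}x over {YEARS} years")
```

The point is only the spread: under this assumption a 58% rate multiplies capacity by four orders of magnitude, while 6% does not even quadruple it.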

I found the following interesting:

Although there are only 8% more broadcast devices in the world than telecommunication equipment (6.66 billion versus 6.15 billion in 2007), the average broadcasting device communicates 27 times more information per day than the average telecommunications gadget. This result might be unexpected, especially considering the omnipresence of the Internet, but can be understood when considering that an average Internet subscription effectively uses its full bandwidth for only around 9 min per day (during an average 1 hour and 36 min daily session).

The rest of the time we’re merely paying for full broadband access…while the access providers label anyone who actually uses the broadband she’s paying for a “bandwidth hog.” But that’s not the article’s point. (Hat tip to Andy Weinberger for the link.)
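For anyone who wants to check the arithmetic in the passage quoted above, here is a minimal Python sketch. The input figures come straight from the quote; the variable names and the choice to express full-bandwidth use as a share of the session and of the day are mine.

```python
# Sanity check of the figures in the quoted passage (2007 numbers).
broadcast_devices = 6.66e9  # from the quote
telecom_devices = 6.15e9    # from the quote

# How many more broadcast devices than telecom devices, as a fraction.
extra_share = broadcast_devices / telecom_devices - 1

full_bandwidth_min = 9      # minutes per day at full bandwidth, from the quote
session_min = 60 + 36       # average daily session: 1 hour and 36 minutes
MIN_PER_DAY = 24 * 60

# Full-bandwidth use as a share of the session and of the whole day.
share_of_session = full_bandwidth_min / session_min
share_of_day = full_bandwidth_min / MIN_PER_DAY

print(f"Extra broadcast devices: {extra_share:.1%}")
print(f"Full bandwidth as share of session: {share_of_session:.1%}")
print(f"Full bandwidth as share of day: {share_of_day:.2%}")
```

So the "8% more" holds up, and the full-bandwidth time is under a tenth of the average session and well under 1% of the day.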