The E-Mail Petri Dish Grows

News Analysis: Researchers are using a lot more e-mail in their work, but chances are you wouldn't know it because many don't bother to tell you.


In the name of science, Columbia University behavioral scientists recently dissected every e-mail sent by every Columbia student for a year.

During the year it took to procure the e-mail and the ensuing months of research on the 14 million or so missives, the 43,000 correspondents never knew they were the subjects of such a grand experiment.

Shockingly, the rather quiet way in which Columbia went about conducting its recently unveiled study is not an isolated case.

Professional and scholastic researchers said during recent interviews that they are digesting increasingly large volumes of e-mail and that, as continues to be the practice, the correspondents never have a clue what's going on. Researchers said they do not feel compelled to get the correspondents' permission.

Despite a backlash, many researchers say they are actively trying to find ways to fill their research with even more missives, a sign of just how unstoppable the forces at work might be.

For instance, Bernardo Huberman, senior fellow and director of the Information Dynamics Lab at Hewlett-Packard Co., said he's trying to top a recent study that involved a few hundred HP employees' mail with one using the e-mails from 50,000 workers.

All the limit-stretching has sparked significant recent debate in a number of areas, such as whether it's feasible, or even possible, to require consent before studying someone's e-mail, and what the appropriate balance is between fulfilling a research need and protecting the privacy rights of test subjects.

What comes from the discussions over the next few weeks and months promises to alter the future for both university researchers who rely on e-mail and the enterprises that seek to benefit from their work, either by following through with products or services based on the research or by providing the necessary equipment and services for the research.

University and private researchers assured eWEEK that to a person, they respect the privacy rights of the people involved in these studies.

"I think there are plenty of spooky privacy issues associated with online activity, and I agree that the trade-off between science and privacy is an issue we'll be thinking about more and more in the future," Duncan Watts, a researcher at the Institute for Social and Economic Research and Policy at Columbia, wrote in an e-mail.

"But I don't think the data in this study should be on your list of worries, compared with the kind of data that Google [Inc.] or Yahoo [Inc.] or AOL [America Online Inc.] or Amazon[.com Inc.], for example, routinely collect and analyze."


In some ways, it's inevitable that people's private communications are routinely reviewed to uncover even more personal details about them, and no one in charge feels the need to tell anybody about it.

Researchers now have at their disposal cutting-edge research techniques such as e-mail "electroscopy," which dissects a body of e-mail correspondence in newer and more revealing ways. The technique rests on a simple principle: The more e-mail it gets to probe, the more exact and useful the results.

To a person, researchers interviewed for this story agreed that the obvious way to appease those reacting with shock to the news that their mail is part of a petri dish for science or private research is to seek their permission before studying the messages.

But as studies grow in size and thereby effectiveness, getting consent from the subjects becomes much more difficult.

Besides, they explain, researchers are rarely after what's inside the e-mails. Rather, they study the time an e-mail was sent, how long the conversation lasted and other rather innocuous details, to fit against a larger context.

Still, there's a lot of emphasis on the precautions to make the mail anonymous before it gets to researchers.
