Toppling the Great Firewall of China

Forget your typical firewall: China's censors rely on sophisticated and ancient techniques.

The Great Firewall of China is no firewall after all.

The People's Republic of China has no firewall perched on its routers to enable censors to block Internet sites.

Rather, the authoritarian regime relies on a far more sophisticated censorship system that uses a keyword blacklist and routers that reach deep into Internet traffic to find forbidden words or phrases.

"Conventional wisdom was it's a firewall: all around the border, you'd be blocked. We found that sometimes [it takes a few hops within China to get blocked], up to 13 hops. Some paths weren't filtered at all," Jed Crandall, an assistant professor of computer science at the University of New Mexico's School of Engineering, told eWEEK.

In fact, the "Great Firewall of China" that researchers believe is used by the government to block users from accessing what it considers objectionable content is in reality a "panopticon," a type of prison that relies on prisoners not being able to tell whether or not they're being observed.

The research group, which includes researchers from the University of California-Davis, has found that what they're calling the Great Firewall of China doesn't have to block every illicit word out there, only enough that users censor themselves because they know their online movements are being watched.

Indeed, some 28 percent of the Chinese hosts that the researchers sent probes to were reachable along paths that weren't filtered at all, thus disproving the idea that Great Firewall of China (GFC) keyword filtering occurs on a firewall strictly at the border of China's connections to the Internet.

Firewall evasion takes on a more complex character, given that Chinese Internet users are tricked into thinking they're constantly being blocked. The researchers thus are proposing an architecture to bypass GFC keyword filtering that doesn't even bother with firewall evasion.

Instead, they're working on a tool, called ConceptDoppler, that opts for a surprising route: namely, to spammify words on China's blacklist. First, they have to discover what those words are, and they're doing so by modulating packets, finding out how many hops packets are using to reach China, and determining which specific routers are doing the blacklist filtering. Those routers are, in fact, sending TCP resets as a way to block download of illicit content.
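The hop-counting idea can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the researchers' code: it sends no real traffic, the Router class and both function names are hypothetical, and it assumes that a probe whose hop limit is raised one step at a time draws a reset as soon as it gets far enough to pass a filtering router.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Router:
    """Hypothetical model of one hop on a path into China."""
    name: str
    filters: bool  # True if this hop checks payloads against the blacklist


def probe(path: Sequence[Router], hop_limit: int, payload: str,
          blacklist: Sequence[str]) -> bool:
    """Return True if the simulated probe would draw a reset.

    The probe only travels `hop_limit` hops, so a reset can only come
    from a filtering router among those first hops.
    """
    reached = path[:hop_limit]
    has_keyword = any(word in payload for word in blacklist)
    return has_keyword and any(router.filters for router in reached)


def locate_filtering_hop(path: Sequence[Router], payload: str,
                         blacklist: Sequence[str]) -> Optional[int]:
    """Raise the hop limit one step at a time; return the first hop
    (1-based) at which a reset appears, or None if the path is unfiltered."""
    for hop_limit in range(1, len(path) + 1):
        if probe(path, hop_limit, payload, blacklist):
            return hop_limit
    return None


if __name__ == "__main__":
    # Illustrative path: the third hop is the one doing the filtering.
    path = [Router("border", False), Router("core-1", False),
            Router("core-2", True), Router("edge", False)]
    print(locate_filtering_hop(path, "GET /?q=falun", ["falun"]))
```

On a path with no filtering router, or with a payload that contains no blacklisted word, the function returns None, which mirrors the finding that some paths were not filtered at all.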

The researchers say ConceptDoppler will act as a kind of weather report on changes in Internet censorship in China and elsewhere. The tool uses algorithms to cluster words by meaning and to identify keywords that are likely to be blacklisted in China. The researchers have a list of 122 words thus far, but told eWEEK that the blacklist likely contains thousands.

Beyond a topological map of worldwide censorship, however, the researchers also plan to turn ConceptDoppler into a tool that will "spammify" blacklisted words, using the same techniques spammers use to evade filters by separating word characters or inserting random characters into words.

"Spammers show us the way," said Earl Barr, a graduate student in computer science at UC-Davis who's also an author on a paper from the researchers that's titled "ConceptDoppler: A Weather Tracker for Internet Censorship."

"We could find out what the best spamming program is out there—[say], some evil Hungarian [program], and use that spam tool for good now," Barr said.

In that scenario, modules on Web sites would signal when they're getting a connection from within China. Site operators who know they have blacklisted words in their content could then run their responses through the spammifying tool and then deliver into China content that escapes keyword filtering.
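As a rough illustration of the "separating word characters" technique, the sketch below rewrites blacklisted phrases in an outgoing response so that an exact keyword match no longer fires. The function names, the choice of a period as separator, and the sample blacklist are all assumptions made for the example; the article does not describe how the planned tool would actually work.

```python
import re
from typing import Iterable


def separate_characters(word: str, separator: str = ".") -> str:
    """Put a separator between every character: 'abc' -> 'a.b.c'."""
    return separator.join(word)


def spammify(text: str, blacklist: Iterable[str], separator: str = ".") -> str:
    """Break up every blacklisted phrase found in `text`.

    Longer phrases are handled first so that a short entry does not
    split a longer one halfway through. Matching ignores case.
    """
    result = text
    for phrase in sorted(blacklist, key=len, reverse=True):
        pattern = re.compile(re.escape(phrase), re.IGNORECASE)
        result = pattern.sub(
            lambda match: separate_characters(match.group(0), separator),
            result,
        )
    return result


if __name__ == "__main__":
    # Hypothetical blacklist built from phrases quoted in the article.
    sample_blacklist = ["Voice of America", "conversion rate"]
    print(spammify("Listen to Voice of America tonight.", sample_blacklist))
```

A human reader can still make out the phrase, while a filter looking for the exact string would miss it. A real deployment would also need to decide which visitors get the altered text, which this sketch leaves out.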

Many words and phrases on the blacklist are predictable, such as "Tibetan Independence Movement," "Falun Gong," "The right to strike," "Tiananmen Square Hunger Strike Group" or "Voice of America."

Some are surprising, such as "conversion rate," "Mein Kampf" or "International geological scientific federation." In some cases, their literal translations into Chinese characters look like possible spellings of other blacklisted words.

