An IBM researcher has
uncovered a way to analyze data while it is still encrypted, in what could be a
boon for both spam-filtering applications and cloud computing environments.
The challenge of manipulating data without exposing it has bugged
cryptographers for decades. But in a breakthrough, IBM
researcher and Stanford University Ph.D. candidate Craig Gentry has developed a
"fully homomorphic encryption" scheme that keeps data
protected.
"Basically, an annoying property
of encryption is that it typically traps data inside a box that prevents
the data from being used or analyzed until you open the box with the secret
decryption key," Gentry said. "What a fully homomorphic encryption
scheme allows you to do is analyze or compute a function of the data while it
remains securely inside the box.
"For example, suppose you want to store your files on an untrusted
server," he continued. "You don't want the server to see your files;
so, you want to encrypt them. On the other hand, you want to be able to access
your files in an 'intelligent' way ... There is a way to express this query as
a function f. If your files are encrypted with the fully homomorphic encryption
scheme, you can send your query to the server, which expresses it as some
function f; the server then homomorphically computes an encryption of f(m_1,
..., m_t) -- where f(m_1, ..., m_t) outputs the relevant files—and sends this
ciphertext back to you. You decrypt to recover the files."
It is also possible to encrypt the query as well, he said.
According to IBM, Gentry's solution could
help strengthen the business model of cloud computing when vendors are hosting
confidential customer data, by enabling them to perform computations on data at
their clients' request without exposing the original data.
It also represents a potential boost for spam filtering of encrypted
e-mails. Spam filtering, he noted, doesn't need to be done online because of
the relative asynchronicity of e-mail, and because the amount of data is likely
to be small.
"The idea for spam-filtering encrypted e-mails is as follows: I encrypt
e-mails to you under your public key pk using the fully homomorphic encryption
scheme. Maybe some spammers also send you some e-mails encrypted under
pk. Now, a spam filter can be expressed as some function f that is applied
to (unencrypted) e-mail data, which outputs a '0' if it is spam, and a '1' if
it is not," Gentry explained.
"To spam-filter encrypted e-mails, the spam filter uses the fully
homomorphic encryption scheme to compute a ciphertext that encrypts f(m), where
the encrypted e-mail is the encryption of m. The filter prepends this
ciphertext to the encryption of m. Now, when you check your encrypted e-mail,
you first decrypt the prepended ciphertext. If the ciphertext encrypted a
'0', indicating the message is spam, you don't bother decrypting the rest of
the e-mail. Otherwise, you decrypt the rest."
At this point, Gentry said, the next step is to work on improving the efficiency
of his discovery. He admitted it takes many times longer to process data
homomorphically while it is in the encryption box than it would to process the
data in the clear.
"Before it becomes a tool, more theoretical work will need to be done
to make it more efficient," he said. "But now that researchers know
that fully homomorphic encryption is possible and have an actual construction
that they can sink their teeth into, I think there is reason to be optimistic
that the efficiency will be improved dramatically."