IBM Research

Idemix project: IDEMIX (Identity mixing)

 


Project overview
Personal data is best protected by not being revealed at all

The protection of data and privacy in the Internet is an issue of increasing concern and importance. Such protection requires not only that general standards and laws be agreed upon but also that technical measures be taken. An example of the latter is the "idemix" project at IBM's Zurich Research Laboratory in Rüschlikon.

Modern forms of electronic communication and commerce are such that practically any transaction leaves a data trail in the global Internet, i.e., information about who conducted which transaction with whom and when. Novel in this respect is not the fact that these data trails exist but rather the ease with which large amounts of data from many diverse sources can be gathered, combined, and exploited in a wide variety of ways.

A scenario to illustrate what this can mean for each one of us is quickly given: Anyone who books a hotel room will probably be registered in an electronic system with his or her name, home address, and length of stay. This in itself may have numerous advantages, also for the hotel guest, for example in terms of frequent-flyer mileage. On the other hand, a malicious hotel employee could supply the guest's home address to an accomplice for a low-risk burglary in the absence of the owner.

A regular customer of an online bookshop might appreciate receiving reading suggestions based on his or her specific preferences, but will be less than pleased if personal data is passed on to a third party, and the reading preferences are exploited for other purposes.

Using the Internet in a variety of ways increasingly puts personal data at risk. But just as modern technology is a threat to one's privacy, it also provides many tools for its protection: Consistent use of encryption when storing and transmitting data protects personal data from uninvolved third parties. A conscientious check of a business partner's identity, for example using digital signatures and public-key infrastructures (PKIs), can help to ensure that personal data is given only to partners deemed trustworthy.

Finally, various technical tools exist to specify the privacy preferences of an individual, for example, which data one is willing to reveal for what purpose and to whom. Moreover, many organizations that process data publish details of their privacy policy, such as which data are gathered, and to whom and for what purpose they may eventually be forwarded.

All these technologies help to prevent an accidental transfer of personal data or a transfer under unclear conditions. But they cannot prevent data being passed on by, for example, a malicious hotel employee or, probably a much more frequent occurrence, being accessed by a curious employee or accidentally revealed because of technical problems.

This is why researchers at IBM's Zurich Research Laboratory in Rüschlikon, Switzerland, go one step further and investigate concepts that embrace "data parsimony." The basic concept is very simple: personal data is best protected if not revealed at all, i.e., if the amount of data revealed is kept to a minimum. The idea is not new: many laws on the protection of personal data contain data frugality as an implicit guideline.

The question then is how "minimum" is defined. Anybody who rents a car has to produce a valid driver's license, thereby - whether voluntarily or involuntarily - revealing a wealth of personal data. Actually, the car rental agency needs the name and address of the person renting a car only in the event of an emergency. As long as there is no accident, it would suffice to know that the person renting a car is in possession of a valid driver's license. In this case, the data minimum could be quite easily achieved: the name and address in the driver's license could simply be replaced by a randomly chosen artificial name, a pseudonym, provided that in an emergency the name hidden by the pseudonym could be retrieved.

The "idemix" system developed in Rüschlikon uses precisely such pseudonyms for e-commerce transactions: Today, anybody who subscribes to an online service has to register with the service using a user name and a password which he or she has to produce each time he or she wants to access the service. Under "idemix" a user would first select a pseudonym, then register using this pseudonym and receive the corresponding credentials with an electronic signature. If the user later wants to access the service, he or she need only prove to the service that the corresponding, digitally signed credentials are in his or her possession.

Of course, a user could merely present his or her pseudonym and the credentials; however, in many cases this would invalidate the desired data-protection advantages in that the online service could monitor when and how a user uses the service, which in turn could result in an involuntary de-anonymization. In addition, the online service would often even know who owns the pseudonym, for example, for billing purposes.

By employing modern cryptographic techniques known as zero-knowledge proofs, researchers at IBM have succeeded in resolving this issue: the pseudonym and credentials are given to the online service only in encrypted form. Although the online service cannot decrypt the information, it can still use a carefully designed interactive protocol with the user to verify the authenticity of the encrypted pseudonym and that the user indeed owns correct, digitally signed credentials.
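
To give a feel for how such an interaction can convince a verifier without revealing the secret, here is a minimal sketch of a Schnorr-style proof of knowledge. It is an illustration only and not the construction used by idemix: the group parameters are toy-sized and insecure, and all function names (keygen, commit, challenge, respond, verify) are hypothetical. The prover shows that it knows the secret x behind a public value y without ever sending x.

```python
import secrets

# Toy group parameters, assumed for illustration only (far too small for real use):
# P = 2*Q + 1 is prime, and G generates the subgroup of prime order Q.
P, Q, G = 2039, 1019, 4

def keygen():
    """Prover's long-term secret x and the public value y = G^x mod P."""
    x = secrets.randbelow(Q - 1) + 1      # secret in [1, Q-1]
    return x, pow(G, x, P)

def commit():
    """Prover picks a fresh random r and sends the commitment t = G^r mod P."""
    r = secrets.randbelow(Q)
    return r, pow(G, r, P)

def challenge():
    """Verifier answers with a random challenge c."""
    return secrets.randbelow(Q)

def respond(x, r, c):
    """Prover's response; the random r masks the secret x."""
    return (r + c * x) % Q

def verify(y, t, c, s):
    """Verifier checks G^s == t * y^c (mod P) without learning x."""
    return pow(G, s, P) == (t * pow(y, c, P)) % P

# One round of the interaction.
x, y = keygen()
r, t = commit()
c = challenge()
s = respond(x, r, c)
print(verify(y, t, c, s))
```

The verifier only ever sees y, t, c, and s. Because r is chosen fresh and at random for every run, the response s does not disclose x, yet a prover who does not know x can answer a random challenge correctly only by luck.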

In an equally secure manner the user can supply credentials received from another organization to the online service. The car rental agency in our example could in this way receive proof of possession of a valid driver's license from the authorities, and of a valid credit card from a bank. A user can in principle present his or her credentials any number of times in this way.

Because a new encryption is used every time, the repeated use is hidden from the online service, i.e., the user is not re-identified and thus, so to speak, can act completely anonymously. However, for many applications this total anonymity is undesirable: for example, if a rented car is not returned, the identity of the person who rented the car has to be retrievable. Therefore, the idemix system also provides for a designated authority that can uncover such an identity. In the case of an "anonymized" driver's license, it could for example be the office that issued the license; in a business context it could be a third party trusted by both business partners.
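
The two properties described here, a fresh encryption on every use and recovery by a designated authority, can be sketched with textbook ElGamal encryption. This is an assumption made for illustration, not a description of the scheme idemix actually uses; the parameters are toy-sized and insecure, and the names (authority_keygen, encrypt, decrypt) as well as the sample pseudonym value are hypothetical.

```python
import secrets

# Toy group parameters, assumed for illustration only (insecure sizes):
# P = 2*Q + 1 is prime, and G generates the subgroup of prime order Q.
P, Q, G = 2039, 1019, 4

def authority_keygen():
    """Key pair of the designated authority that may uncover identities."""
    sk = secrets.randbelow(Q - 1) + 1
    return sk, pow(G, sk, P)

def encrypt(pk, m, k=None):
    """Encrypt the group element m under the authority's public key.

    A fresh random k is drawn on every call unless one is supplied,
    so two encryptions of the same m look unrelated to an observer.
    """
    if k is None:
        k = secrets.randbelow(Q - 1) + 1
    return pow(G, k, P), (m * pow(pk, k, P)) % P

def decrypt(sk, ct):
    """Only the holder of sk can recover m from the ciphertext."""
    c1, c2 = ct
    # c1 has order Q, so c1^(Q - sk) is the inverse of c1^sk.
    return (c2 * pow(c1, Q - sk, P)) % P

sk, pk = authority_keygen()
pseudonym = pow(G, 123, P)          # hypothetical pseudonym encoded as a group element

first_visit = encrypt(pk, pseudonym)
second_visit = encrypt(pk, pseudonym)

# The authority recovers the same pseudonym from both ciphertexts.
print(decrypt(sk, first_visit) == pseudonym)
print(decrypt(sk, second_visit) == pseudonym)
```

The online service sees only the ciphertexts, which differ whenever a different random k is drawn, while the authority holding sk can open either one if, say, a rented car is not returned.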

Business value
» Automatic enforcement of EPAL privacy policies in large relational databases.
» Effective management of personally identifiable information.
» Improved performance and cost-effectiveness.
» Key market differentiator and a competitive advantage for the enterprise.

 

 
