Our research addresses data privacy and security issues in emerging technologies. In particular, we design, implement, and evaluate practical systems with provable privacy guarantees for data management in application scenarios such as healthcare, location-based services, and social networks.
Current Research Projects:
Secure Computation on Encrypted Genomic Data
Large-scale biomedical research projects aim to capture the vast amount of meaningful information encoded in the human genome. Such projects often require disparate organizations to share the genomic and clinical data they have collected. Because genomic data are sensitive, the process of sharing, managing, and analyzing them must not reveal the identity of the individuals who contributed their genomic samples. Storage and computation on the shared data can be delegated to third-party cloud infrastructure, which is highly available and scalable and offers large storage and high-performance computing resources. However, outsourcing sensitive genomic data to a third-party cloud carries the risk of loss, theft, or misuse of the data: the server administrator cannot be completely trusted, and there is no guarantee that the server's security will not be breached. In this research, we address three challenges for secure sharing of and computation on genomic data. The first challenge is to guarantee data privacy: the data stored on the cloud server, as well as the computation carried out throughout the analysis, should be protected, so that even if the cloud server is compromised, the attacker learns nothing about the stored data. The second challenge is to provide query privacy: neither the institutions contributing the data, nor the cloud service provider, nor an adversary should learn anything about a query executed by a researcher or research institution. The third challenge is to achieve output privacy: the result of a query should not be disclosed to anyone except the researcher who initiated it. Our objective is to provide a viable solution for sharing and computing on genomic data that overcomes all three challenges.
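To make the idea of computing on encrypted data concrete, the sketch below uses a toy version of the Paillier additively homomorphic cryptosystem. It is purely illustrative and is not the scheme developed in this project: the technique, the tiny primes, the function names (`keygen`, `encrypt`, `add_encrypted`, `decrypt`), and the per-institution counts are all assumptions chosen for the example. It shows a server summing encrypted counts without ever seeing the underlying values.

```python
import math
import secrets

def keygen(p=1789, q=1861):
    """Toy key generation. The small primes are for illustration only;
    a real deployment would use primes of 1024 bits or more."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # modular inverse (Python 3.8+); valid for g = n + 1
    return n, (lam, mu)   # public key n, private key (lam, mu)

def encrypt(n, m):
    """Encrypt an integer 0 <= m < n under public key n (generator g = n + 1)."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1  # random value in 1..n-1
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2

def add_encrypted(n, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts (mod n)."""
    return (c1 * c2) % (n * n)

def decrypt(n, priv, c):
    """Recover the plaintext with the private key."""
    lam, mu = priv
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n

# Hypothetical usage: three institutions each encrypt a local count
# (for example, carriers of some variant); the cloud adds the ciphertexts.
n, priv = keygen()
local_counts = [12, 7, 30]
ciphertexts = [encrypt(n, m) for m in local_counts]
encrypted_total = ciphertexts[0]
for c in ciphertexts[1:]:
    encrypted_total = add_encrypted(n, encrypted_total, c)
total = decrypt(n, priv, encrypted_total)  # only the key holder learns the sum
```

In this sketch only the holder of the private key learns the total, which loosely mirrors the output privacy goal. Query privacy is not addressed by this example.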
Trusted Hardware Assisted Cryptographic Framework for Efficient Secure Computation
Statistical analysis and machine learning algorithms are now widely used in applications where deep, predictive insights must be extracted from large-scale and diverse datasets. While this kind of analysis has proven to be a powerful data-driven solution to many real-life problems, its use in sensitive domains (such as healthcare and finance) raises many privacy and security threats. In addition to achieving the fundamental goal of information extraction, privacy-preserving data analysis should also protect the sensitive information of individuals. To ensure the privacy of sensitive data, different cryptographic techniques have been adopted. These techniques allow some functions to be computed on data from different parties without compromising the confidentiality and integrity of that data. However, they incur impractical computational overhead, impractical communication overhead, or both. Moreover, they support only a limited family of computations and demand high storage capacity. To overcome the limitations of existing cryptographic approaches, our aim is to develop a cryptographic framework that leverages the secure hardware component of the recently introduced Intel Software Guard Extensions (Intel SGX). Intel SGX is a set of extensions to the Intel architecture that allows parts of a program to be executed inside hardware-protected regions called enclaves. Computations inside an enclave are generally performed on plaintext, which avoids the cost of operating directly on ciphertexts. Thus, the aim of the research project is to develop a trusted hardware-assisted cryptographic framework that will make it possible to perform sophisticated mathematical analysis securely and efficiently.
Secure and Efficient Nearest Neighbour Search on Outsourced Data
The advent of ubiquitous computing, social networking, and other online applications has generated a vast amount of data. This massive volume of data demands an enormous amount of storage, which many organizations find difficult to provide. Outsourcing the data to the cloud is a potential solution, as it gives organizations access to large-scale computation and storage at an affordable price. However, despite the many benefits of the cloud computing paradigm, organizations are reluctant to outsource their data directly to the cloud because of security concerns (e.g., the recent celebrity iCloud leak incident). Although various encryption techniques can be used to prevent unauthorized access, these techniques are not suitable for efficiently executing nearest neighbour search on a large volume of data. Nearest neighbour search is a primitive operation in many applications, such as social discovery, recommendation, and healthcare. Motivated by these problems, this research addresses the following question: “How can one perform secure and efficient nearest neighbour search over encrypted data in an outsourced environment?”
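For reference, the sketch below shows the plaintext operation that a secure scheme would need to reproduce: a brute-force k-nearest-neighbour search under Euclidean distance. It is only a baseline under stated assumptions, not the method investigated in this project; the function name `k_nearest`, the distance metric, and the sample points are hypothetical.

```python
import heapq
import math

def k_nearest(points, query, k):
    """Return the k points closest to `query`, nearest first.
    Brute force: every point is compared against the query."""
    return heapq.nsmallest(k, points, key=lambda p: math.dist(p, query))

# Hypothetical usage with 2-D feature vectors.
records = [(0.0, 0.0), (5.0, 5.0), (1.0, 1.0), (9.0, 9.0)]
neighbours = k_nearest(records, (0.9, 0.9), k=2)
```

Running this on standard encrypted records is not possible because the distances are computed from plaintext coordinates, which is why the question above calls for techniques that support the search without exposing the data to the cloud server.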