Publication Date


Date of Final Oral Examination (Defense)


Type of Culminating Activity


Degree Title

Master of Science in Computer Science

Department Filter

Computer Science



Supervisory Committee Chair

Gaby Dagher, Ph.D.

Supervisory Committee Member

Bogdan Dit, Ph.D.

Supervisory Committee Member

Jyh-Haw Yeh, Ph.D.


Privacy-preserving distributed data mining is the study of mining data that is distributed among multiple data owners in a non-secure environment, such that the mining protocol reveals no sensitive information to the data owners, individual privacy is preserved, and the output mining model is practically useful. In this thesis, we propose a secure two-party protocol for building a privacy-preserving decision tree classifier over distributed data using differential privacy. We use secure multiparty computation to ensure that the protocol is privacy-preserving. Our algorithm also relies on parallel and sequential composition and applies a distributed exponential mechanism to ensure that the output is differentially private. We implemented our protocol in a distributed environment on real-life data, and the experimental results show that the protocol produces decision tree classifiers with high utility while being reasonably efficient and scalable.
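For readers unfamiliar with the building blocks named in the abstract, the following is a minimal single-party sketch of the exponential mechanism and of a budget split based on sequential and parallel composition. It is an illustration only, not the thesis's distributed two-party protocol: the function names, the attribute scores, the sensitivity value, and the even per-level budget split are all assumptions made for this example.

```python
import math
import random


def exponential_mechanism(scores, epsilon, sensitivity, rng=random):
    """Pick a key of `scores` with probability proportional to
    exp(epsilon * score / (2 * sensitivity)).

    `scores` maps candidates (e.g. split attributes) to utility values.
    """
    keys = list(scores)
    top = max(scores.values())
    # Subtracting the maximum score leaves the distribution unchanged
    # and avoids overflow in exp() for large epsilon.
    weights = [
        math.exp(epsilon * (scores[k] - top) / (2.0 * sensitivity))
        for k in keys
    ]
    return rng.choices(keys, weights=weights, k=1)[0]


def per_level_budget(total_epsilon, depth):
    """Split a privacy budget evenly over `depth` tree levels.

    Assumption for this sketch: levels query the same records, so their
    budgets add up (sequential composition), while nodes on one level
    hold disjoint records and can each use the full level budget
    (parallel composition).
    """
    return total_epsilon / depth


if __name__ == "__main__":
    # Hypothetical utility scores for three candidate split attributes.
    candidate_scores = {"age": 0.42, "income": 0.35, "zip_code": 0.05}
    eps_level = per_level_budget(total_epsilon=1.0, depth=4)
    chosen = exponential_mechanism(candidate_scores, eps_level, sensitivity=1.0)
    print("budget per level:", eps_level)
    print("chosen split attribute:", chosen)
```

A smaller epsilon makes the choice closer to uniform (more privacy, less utility), while a larger epsilon concentrates probability on the highest-scoring candidate.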

