Publication Date

8-2019

Date of Final Oral Examination (Defense)

5-31-2019

Type of Culminating Activity

Thesis

Degree Title

Master of Science in Computer Science

Department Filter

Computer Science

Department

Computer Science

Supervisory Committee Chair

Michael D. Ekstrand, Ph.D.

Supervisory Committee Member

Maria Soledad Pera, Ph.D.

Supervisory Committee Member

Hoda Mehrpouyan, Ph.D.

Abstract

Recommender systems are software applications deployed on the Internet to help people find useful items (e.g., movies, books, music, products) by providing recommendation lists. Before deploying recommender systems online, researchers and practitioners generally conduct offline evaluations that use users’ historical consumption data to compare the accuracy of the top-N recommendation lists produced by candidate algorithms. These offline evaluations typically use metrics and methodologies borrowed from machine learning and information retrieval, and they have several well-known biases that affect the validity of their results, including popularity bias and other biases arising from the missing-not-at-random nature of the data used. The existence of these biases is well established, but their extent and impact are not as well studied. In this work, we employ controlled simulations with varying assumptions about the distribution and structure of users’ preferences and the rating process to estimate the distributions of the errors in recommender experiment outcomes that result from these biases. We calibrate our simulated datasets to mimic key statistics of existing public datasets in different domains and use the simulated data to assess the error in estimating true accuracy from observable rating data. We find that evaluation metric scores, and the order in which they rank recommendation algorithms, are inconsistent between the synthetic true-preference data and the observation dataset. Simulation results show that offline evaluations are sometimes fooled by intrinsic effects in the data generation process into ranking algorithms incorrectly. The extent of this effect is sensitive to the assumptions made in the simulation.
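
The comparison described in the abstract, between accuracy measured on observable data and accuracy measured against true preferences, can be illustrated with a small toy simulation. The sketch below is not the thesis's simulation model or code. The function names, the probabilities `like_prob` and `see_prob`, the two non-personalized recommenders, and all parameter values are assumptions chosen only for illustration. It draws true preferences for each user, reveals only some liked items (with unequal exposure across items, so data is not missing at random), and then scores each recommender twice with precision at k: once against held-out observed items and once against the full set of truly liked items.

```python
import random


def precision_at_k(recs, relevant):
    """Fraction of recommended items that are in the relevant set."""
    return sum(1 for i in recs if i in relevant) / len(recs)


def run_simulation(n_users=300, n_items=40, k=5, seed=7):
    rng = random.Random(seed)
    # Assumption: chance that a user truly likes item i.
    like_prob = [rng.uniform(0.05, 0.5) for _ in range(n_items)]
    # Assumption: chance that a liked item is actually observed; it decays
    # with the item index, so exposure is unequal and unrelated to appeal.
    see_prob = [0.9 / (1 + i) ** 0.5 for i in range(n_items)]

    users = []
    for _ in range(n_users):
        liked = {i for i in range(n_items) if rng.random() < like_prob[i]}
        seen = sorted(i for i in liked if rng.random() < see_prob[i])
        rng.shuffle(seen)
        half = len(seen) // 2
        # (true preferences, observed training items, observed test items)
        users.append((liked, set(seen[:half]), set(seen[half:])))

    # Popularity is estimated from the observed training data only.
    counts = [0] * n_items
    for _, train, _ in users:
        for i in train:
            counts[i] += 1
    by_popularity = sorted(range(n_items), key=lambda i: -counts[i])

    totals = {"popular": [0.0, 0.0], "random": [0.0, 0.0]}
    for liked, train, test in users:
        candidates = [i for i in by_popularity if i not in train]
        lists = {
            "popular": candidates[:k],
            "random": rng.sample(candidates, k),
        }
        for name, recs in lists.items():
            # Offline estimate: only held-out observed items count as hits.
            totals[name][0] += precision_at_k(recs, test)
            # Ground truth: every truly liked, non-training item is a hit.
            totals[name][1] += precision_at_k(recs, liked - train)

    return {name: (obs / n_users, true / n_users)
            for name, (obs, true) in totals.items()}


if __name__ == "__main__":
    for name, (observed, true) in run_simulation().items():
        print(f"{name}: observed P@5 = {observed:.3f}, true P@5 = {true:.3f}")
```

Because the held-out observed items are a subset of the truly liked items, the observed score in this toy setup can never exceed the true score. How large the gap is, and whether it differs between the two recommenders, depends on the assumed exposure and preference distributions.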

DOI

https://doi.org/10.18122/td/1581/boisestate

Recommended Citation

Tian, Mucun, "Estimating Error and Bias of Offline Recommender System Evaluation Results" (2019). Boise State University Theses and Dissertations. 1581.
https://doi.org/10.18122/td/1581/boisestate

Included in

Computer Sciences Commons

Estimating Error and Bias of Offline Recommender System Evaluation Results
