UMass Amherst (Oct 2017 - Dec 2017)

Members: Rohith Pesala, Akul Swamy

Links: Github, Report

Summary:
This project is motivated by the observation that people's visual preferences affect their decisions. In this work, we implement several models that incorporate this visual information and show that their results are comparable to the state of the art.



Current recommendation systems often rely on traditional matrix factorization techniques, which have been hard to beat without additional information. With the rise of deep learning and the growth in available data, it has become feasible to integrate multiple data modalities. This work aims to develop models that incorporate visual information from product images and use it to produce a similarity measure between a user and an item.

The overall pipeline is an end-to-end network that takes a product image, an item ID, and a user ID as input and predicts a visual similarity measure between the user and the item. First, we extract features from the product image; these are fed to a deep network along with the user and item embeddings looked up from their IDs. We evaluate with top-N recommendation, using precision, recall, and F-measure as the metrics.
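As a rough illustration of the evaluation step, the sketch below computes precision, recall, and F-measure for one user's top-N list. It is not taken from the project code: the function name `top_n_metrics`, the `scores` dictionary of predicted similarities, and the `relevant` set of held-out items are all assumptions made for this example, and the project's actual evaluation may differ in details such as tie handling or averaging across users.

```python
def top_n_metrics(scores, relevant, n):
    """Precision, recall and F-measure of a top-N recommendation list.

    scores   -- hypothetical dict mapping item ID -> predicted similarity
    relevant -- hypothetical set of item IDs the user actually interacted with
    n        -- number of items to recommend
    """
    # Rank items by predicted similarity, highest first, and keep the top N.
    ranked = sorted(scores, key=scores.get, reverse=True)[:n]

    # A "hit" is a recommended item that is in the held-out relevant set.
    hits = sum(1 for item in ranked if item in relevant)

    # Precision is taken over N, even if fewer than N items were scored.
    precision = hits / n if n else 0.0
    recall = hits / len(relevant) if relevant else 0.0

    # F-measure is the harmonic mean of precision and recall.
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_measure


# Toy example with made-up scores for a single user.
example_scores = {"a": 0.9, "b": 0.8, "c": 0.1, "d": 0.5}
example_relevant = {"a", "d", "e"}
p, r, f = top_n_metrics(example_scores, example_relevant, n=2)
```

In practice these per-user values would typically be averaged over all test users to give a single figure per metric.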