Acceleration and Compression of Deep Click-Through Rate Prediction Models

Thesis Type: Master
Thesis Status:
Student: Andreas Peintner
Thesis Supervisor:
Research Field:

Deep neural networks have been highly successful in many areas, but they are often computationally expensive and memory intensive. Recommender systems in particular are sensitive to latency, so large models with long inference times are poorly suited to production use in this field. In this thesis, we apply several compression methods to a state-of-the-art deep recommender system model to reduce latency and memory consumption while keeping accuracy as high as possible.