Author Profiling on social Q&A platforms

Thesis Type Master
Thesis Status Currently running
Student Benjamin Binder
Thesis Supervisor
Research Field

Author profiling is a widely researched problem that aims to automatically extract meta information about the author of a text document, such as gender, age, or psychological state. That is, given a previously unseen text document like an email or a blog entry, an algorithm predicts that it was written by, e.g., a female who is between 20 and 30 years old and is introverted.
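To make the prediction task concrete, the following is a minimal sketch of a text classifier, assuming a simple multinomial Naive Bayes baseline over word counts. This is not the method of the thesis; the class name `NaiveBayesProfiler`, the two toy texts, and the placeholder labels `group_a` and `group_b` (standing in for an attribute such as an age bracket) are made up for illustration only.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Very simple whitespace tokenizer; real systems use richer features.
    return text.lower().split()


class NaiveBayesProfiler:
    """Hypothetical baseline: multinomial Naive Bayes with add-one smoothing."""

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.label_counts = Counter(labels)      # label -> number of documents
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        total_docs = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.label_counts.items():
            # Log prior of the label.
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            for token in tokenize(text):
                if token not in self.vocab:
                    continue  # ignore words never seen during training
                count = self.word_counts[label][token]
                # Add-one (Laplace) smoothed log likelihood of the word.
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy, made-up training data with placeholder labels.
train_texts = [
    "the compiler reports a linker error",
    "my sourdough starter will not rise",
]
train_labels = ["group_a", "group_b"]

model = NaiveBayesProfiler().fit(train_texts, train_labels)
prediction = model.predict("linker error again")
```

In a real study, the labels would be the author attributes of interest and the features would go well beyond raw word counts.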

In this thesis, the social question and answer (Q&A) network StackExchange should be used to create a novel dataset for conducting studies on author profiling. To create the dataset, the StackExchange API can be used, i.e., posts should be extracted together with as much meta information about their authors as possible (including gender or age). A machine learning model should then be trained on this dataset and evaluated using state-of-the-art techniques.
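The data collection step could look roughly like the following sketch, assuming version 2.3 of the public Stack Exchange API, its `/users/{ids}/posts` endpoint, and the built-in `withbody` filter. The helper names `build_posts_url`, `fetch_json`, and `extract_records` are hypothetical, and the sketch makes no assumption about which author attributes the API actually returns; that has to be checked against the API documentation.

```python
import gzip
import json
import urllib.parse
import urllib.request

# Assumption: version 2.3 of the public Stack Exchange API.
API_ROOT = "https://api.stackexchange.com/2.3"


def build_posts_url(user_ids, site="stackoverflow", page=1, pagesize=100):
    """Build the request URL for the posts of one or more users."""
    ids = ";".join(str(user_id) for user_id in user_ids)
    query = urllib.parse.urlencode({
        "site": site,
        "page": page,
        "pagesize": pagesize,
        "filter": "withbody",  # built-in filter that includes the post body
    })
    return f"{API_ROOT}/users/{ids}/posts?{query}"


def fetch_json(url):
    """Download and decode one API response (performs a network request)."""
    with urllib.request.urlopen(url) as response:
        raw = response.read()
        # The API compresses its responses; urllib does not decompress them.
        if response.headers.get("Content-Encoding") == "gzip":
            raw = gzip.decompress(raw)
    return json.loads(raw)


def extract_records(response):
    """Turn an API response wrapper into flat (post, author) records."""
    records = []
    for item in response.get("items", []):
        owner = item.get("owner", {})
        records.append({
            "post_id": item.get("post_id"),
            "user_id": owner.get("user_id"),
            "text": item.get("body", ""),
        })
    return records


# Usage without network access: parse a hand-written sample response.
sample_response = {
    "items": [
        {"post_id": 5, "owner": {"user_id": 7}, "body": "<p>hi</p>"},
    ],
    "has_more": False,
}
sample_url = build_posts_url([1, 2])
sample_records = extract_records(sample_response)
```

A full crawler would additionally page through results while `has_more` is true and respect the API's request quota.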