PhD Defense: Teaching Machines to Ask Useful Clarification Questions
Inquiry is fundamental to communication, and machines cannot effectively collaborate with humans unless they can ask questions. Asking questions is also a natural way for machines to express uncertainty, a task of increasing importance in an automated society. In the field of natural language processing, despite decades of work on question answering, there is relatively little work on question asking. Moreover, most previous work has focused on generating reading-comprehension-style questions that are answerable from the provided text.
The goal of my dissertation work, on the other hand, is to understand how we can teach machines to ask clarification questions that point at the missing information in a text. We focus primarily on two scenarios where we find such question asking to be useful: (1) clarification questions on posts found in community-driven technical support forums such as StackExchange, and (2) clarification questions on descriptions of products on e-retail platforms such as Amazon.
In this dissertation we claim that, given large amounts of previously asked questions in various contexts (within a particular scenario), we can build machine learning models that can ask useful questions in a new, unseen context (within the same scenario). To validate this hypothesis, we first create two large datasets of contexts paired with clarification questions (and answers) for the two scenarios of technical support and e-retail, by automatically extracting this information from available data dumps of StackExchange and Amazon.
Given these datasets, in our first line of research, we build a machine learning model that first extracts a set of candidate clarification questions and then ranks them so that more useful questions appear higher in the ranking. Our model is inspired by the idea of expected value of perfect information: a good question is one whose expected answer will be useful.
We hypothesize that by explicitly modeling the value added by an answer to a given context, our model can learn to identify more useful questions. We evaluate our model against expert human judgments on the StackExchange dataset and demonstrate significant improvements over controlled baselines.
In our second line of research, we build a machine learning model that learns to generate a new clarification question from scratch, instead of ranking previously seen questions. We hypothesize that we can train our model to generate good clarification questions by incorporating the usefulness of an answer to the clarification question into recent sequence-to-sequence neural network approaches. We develop a Generative Adversarial Network (GAN) where the generator is a sequence-to-sequence model and the discriminator is a utility function that models the value of updating the context with the answer to the clarification question. We evaluate our model on our two datasets of StackExchange and Amazon, using both automatic metrics and human judgments of usefulness, specificity and relevance, showing that our approach outperforms both a retrieval-based model and ablations that exclude the utility model and the adversarial training.
We observe that our question generation model generates questions that span a wide spectrum of specificity to the given context. We argue that generating questions at a desired level of specificity (to a given context) can be useful in many scenarios. In our last line of research, we therefore build a question generation model that, given a context and a level of specificity (generic or specific), generates a question at that level of specificity. We hypothesize that by providing the level of specificity of the question to our model at training time, it can learn patterns in the question that indicate the level of specificity and use those to generate questions at a desired level of specificity.
To automatically label the large number of questions in our training data with their level of specificity, we train a binary classifier that, given a context and a question, predicts whether the question is specific (to the context) or generic. We demonstrate the effectiveness of our specificity-controlled question generation model by evaluating it on the Amazon dataset using human judgments.
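As a rough illustration of the ranking idea from the first line of research (a good question is one whose expected answer will be useful), the following minimal sketch scores each candidate question by a probability-weighted average of answer utilities. It is not the dissertation's model: the function names `rank_questions`, `toy_answer_weight` and `toy_utility`, the word-overlap utility, and the example data are all hypothetical stand-ins for components that would be learned in practice.

```python
from typing import Callable, List, Tuple


def rank_questions(
    context: str,
    qa_pairs: List[Tuple[str, str]],
    answer_weight: Callable[[str, str, str], float],
    utility: Callable[[str, str], float],
) -> List[str]:
    """Rank candidate questions by the expected utility of their answers.

    Each question is scored as sum_j P(a_j | context, q) * U(context, a_j),
    where P is obtained by normalizing the non-negative answer weights.
    """
    answers = [answer for _, answer in qa_pairs]
    scored = []
    for question, _ in qa_pairs:
        weights = [answer_weight(context, question, a) for a in answers]
        total = sum(weights)
        if total == 0:
            expected = 0.0
        else:
            expected = sum(
                (w / total) * utility(context, a)
                for w, a in zip(weights, answers)
            )
        scored.append((expected, question))
    # Highest expected utility first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [question for _, question in scored]


# Hypothetical toy data: (candidate question, answer it received elsewhere).
CONTEXT = "my laptop will not boot after the update"
QA_PAIRS = [
    ("did you try again", "yes"),
    ("which os version are you running", "ubuntu 18.04 with kernel 4.15"),
]
PAIRED_ANSWER = dict(QA_PAIRS)


def toy_answer_weight(context: str, question: str, answer: str) -> float:
    # Assumption: an answer is most likely for the question it was paired with.
    return 1.0 if PAIRED_ANSWER[question] == answer else 0.1


def toy_utility(context: str, answer: str) -> float:
    # Assumption: utility is the number of words the answer adds to the context.
    return float(len(set(answer.split()) - set(context.split())))


ranked = rank_questions(CONTEXT, QA_PAIRS, toy_answer_weight, toy_utility)
print(ranked)
```

In this toy setup the question about the OS version ranks first, because its likely answer adds more new words to the context than a bare "yes" does. A real system would replace both toy functions with trained models.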
Chair: Dr. Hal Daumé III
Dean's rep: Dr. Philip Resnik
Members: Dr. Marine Carpuat, Dr. Jordan Boyd-Graber, Dr. David Jacobs, Dr. Lucy Vanderwende