Using Document Summarization Techniques for Speech Data Subset Selection

Kai Wei, Yuzong Liu, Katrin Kirchhoff and Jeff Bilmes

In this paper, we take methods from submodular function optimization that were developed for document summarization and apply them to the problem of subselecting acoustic data. We evaluate the approach on data subset selection for a phone recognition task. Our framework shows significant improvements over random selection and previously proposed methods using a similar amount of resources.
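
To make the idea of submodular subset selection concrete, the following is a minimal sketch, not the method evaluated in this paper. It assumes a facility-location objective f(S) = sum_i max_{j in S} sim[i, j] over a non-negative pairwise similarity matrix and a simple cardinality budget, maximized with the standard greedy algorithm. The names `facility_location`, `greedy_select`, `sim`, and `budget` are illustrative assumptions; the abstract does not specify which objective or constraint is used.

```python
import numpy as np


def facility_location(sim, selected):
    """Facility-location value f(S) = sum_i max_{j in S} sim[i, j].

    Assumes sim is a non-negative (n x n) similarity matrix; f(empty set) = 0.
    """
    if len(selected) == 0:
        return 0.0
    return float(sim[:, list(selected)].max(axis=1).sum())


def greedy_select(sim, budget):
    """Greedily pick up to `budget` items, each time adding the item
    with the largest marginal gain in the facility-location objective."""
    n = sim.shape[0]
    selected = []
    chosen = np.zeros(n, dtype=bool)
    # coverage[i] = best similarity between item i and the selected set so far
    coverage = np.zeros(n)
    for _ in range(min(budget, n)):
        # Marginal gain of candidate j: sum_i max(sim[i, j], coverage[i]) - sum_i coverage[i]
        gains = np.maximum(sim, coverage[:, None]).sum(axis=0) - coverage.sum()
        gains[chosen] = -np.inf  # never re-select an item
        best = int(np.argmax(gains))
        selected.append(best)
        chosen[best] = True
        coverage = np.maximum(coverage, sim[:, best])
    return selected
```

In a speech setting, `sim` would hold pairwise similarities between utterances (how those similarities are computed is left open here), and `budget` would be replaced by whatever resource constraint applies, such as total audio duration.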

