How can a single person understand what's going on in a collection of millions of documents? This is an increasingly widespread problem: sifting through an organization's e-mails, understanding a decade's worth of newspapers, or characterizing a scientific field's research. This monograph explores the ways that humans and computers make sense of document collections through tools called topic models. Topic models are a statistical framework that helps users understand large document collections, not just to find individual documents but to understand the general themes present in the collection....