Tensor Networks and Language Models

Tensor Networks and Language Models

Unsupervised machine learning algorithms based on tensor networks can provide an excellent inductive bias for generative language models. I’ll share some ideas about this, including a big-picture overview and a mathematical “look under the hood” at a training algorithm based on the density matrix renormalization group (DMRG) procedure, which helps to explain the performance of these models. I’ll start with some of this motivation, then describe an elementary passage from classical to quantum probability theory, and—after a brief introduction to tensor network diagrams—use the passage as inspiration for taking a deeper look at the DMRG training algorithm.