Probabilistic finite automata (PFAs) are common statistical language model in natural language and speech processing. A typical task for PFAs is to compute the probability of all strings that match a query pattern. An important special case of this problem is computing the probability of a string appearing as a prefix, suffix, or infix. These problems find use in many natural language processing tasks such word prediction and text error correction. Recently, we gave the first incremental algorithm to efficiently compute the infix probabilities of each prefix of a string (Cognetta et al., 2018). We develop an asymptotic improvement of that algorithm and solve the open problem of computing the infix probabilities of PFAs from streaming data, which is crucial when processing queries online and is the ultimate goal of the incremental approach.
展开▼