Published in Math ∩ Programming
Author: Jeremy Kun

A First Look at Google’s N-Gram Corpus

In this post we will focus on the problem of finding the appropriate word boundaries in strings like “homebuiltairplanes”, as is common in web URLs like www.homebuiltairplanes.com. This is an interesting problem because humans do it so easily, but there is no obvious programmatic solution.
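
To make the difficulty concrete, here is a minimal sketch of the most naive approach: list every way to split the string into words from a dictionary. The vocabulary `VOCAB` and the function name `segmentations` are made up for illustration; they are assumptions, not data or code from the n-gram corpus.

```python
from functools import lru_cache

# Hand-picked toy vocabulary (an assumption for illustration only).
VOCAB = frozenset({"home", "built", "homebuilt", "air", "planes", "airplanes"})


def segmentations(text, vocab=VOCAB):
    """Return every way to split `text` into words drawn from `vocab`."""

    @lru_cache(maxsize=None)
    def split_from(start):
        # Reaching the end of the string means exactly one way to finish:
        # the empty split.
        if start == len(text):
            return [[]]
        results = []
        # Try every prefix of the remaining text that is a known word,
        # then recursively split whatever follows it.
        for end in range(start + 1, len(text) + 1):
            word = text[start:end]
            if word in vocab:
                for rest in split_from(end):
                    results.append([word] + rest)
        return results

    return split_from(0)


if __name__ == "__main__":
    for words in segmentations("homebuiltairplanes"):
        print(" ".join(words))
```

Even with six words in the vocabulary, this yields four candidate splits, among them "home built air planes" and "homebuilt airplanes". A dictionary alone gives no way to prefer one over another, which is presumably where frequency information of the kind found in an n-gram corpus becomes useful.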