A solution without regular expressions

There are many ways to select columns from delimited text. The easiest ones use GNU awk, cut, and Perl. Let's consider the following file:
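
As a rough illustration of column selection without regular expressions, here is a minimal Python sketch. It is not one of the tools named above, and the sample rows, the tab delimiter, and the helper name `select_columns` are all assumptions made for the example rather than anything taken from the original file.

```python
def select_columns(line, indices, delimiter="\t"):
    """Return the chosen 0-based columns of one delimited line."""
    # Plain string split on a fixed delimiter: no regular expressions involved.
    fields = line.rstrip("\n").split(delimiter)
    return delimiter.join(fields[i] for i in indices)


# Hypothetical sample rows standing in for a real tab-delimited file.
sample = ["id\tname\tscore\n", "1\tAda\t90\n", "2\tLinus\t85\n"]

# Keep the first and third columns of every row.
selected = [select_columns(row, [0, 2]) for row in sample]

for row in selected:
    print(row)
```

On tab-delimited input this is roughly comparable to `cut -f1,3`, except that the indices here are 0-based and the columns come out in the order they are listed.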

A draft of the new version of the book "Speech and Language Processing (3rd ed. draft)" by Dan Jurafsky and James H. Martin is available on the authors' website. Although it is produced using pdflatex with hyperref, the bookmarks of the table of contents (TOC) have an incorrect hierarchy. However, the …