| Welche | Farbe | hat | der | gelbe | Bus? | 
|---|---|---|---|---|---|
| Which | color | has | the | yellow | bus? | 
As I have learned, linguists have a fancy term for this: interlinear gloss. Several LaTeX packages are available for this purpose; I decided to use gb4e.
For simplicity, the text to be glossed is assumed to be a plain text file in which sentences end with '.', '?' or '!' followed by two spaces, and words are separated by single spaces. Implementing a small script that creates a document with nicely aligned words is then very straightforward. The dictionary needs to be provided as a .csv file.
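To make the idea concrete, here is a minimal sketch in Python of how such a script could work. It is not the downloadable script itself. The function names, the two-column `word,translation` CSV layout, the `??` placeholder for unknown words and the lowercase fallback lookup are all assumptions made for illustration. Only `\gll`, `\ex` and the `exe` environment come from gb4e.

```python
import csv
import io
import re


def load_dictionary(csv_file):
    """Read rows of the assumed form word,translation into a dict."""
    return {row[0]: row[1] for row in csv.reader(csv_file) if len(row) >= 2}


def split_sentences(text):
    """Split after '.', '?' or '!' when followed by exactly two spaces."""
    parts = re.split(r"(?<=[.?!]) {2}", text.strip())
    return [p for p in parts if p]


def gloss_word(word, dictionary):
    """Look up one word, keeping trailing punctuation on the gloss."""
    core = word.rstrip(".?!,;:")
    tail = word[len(core):]
    # Try the exact spelling first, then lowercase (sentence-initial words).
    gloss = dictionary.get(core, dictionary.get(core.lower(), "??"))
    # gb4e aligns word by word, so a multi-word gloss is grouped in braces.
    if " " in gloss:
        gloss = "{" + gloss + "}"
    return gloss + tail


def to_latex(text, dictionary):
    """Build a small LaTeX document with one glossed example per sentence."""
    lines = [
        r"\documentclass{article}",
        r"\usepackage{gb4e}",
        r"\begin{document}",
        r"\begin{exe}",
    ]
    for sentence in split_sentences(text):
        words = sentence.split()
        glosses = [gloss_word(w, dictionary) for w in words]
        lines.append(r"\ex")
        lines.append(r"\gll " + " ".join(words) + r" \\")
        lines.append(" ".join(glosses) + r" \\")
    lines += [r"\end{exe}", r"\end{document}"]
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical dictionary content, inlined instead of read from disk.
    demo_csv = io.StringIO(
        "Welche,Which\nFarbe,color\nhat,has\nder,the\ngelbe,yellow\nBus,bus\n"
    )
    print(to_latex("Welche Farbe hat der gelbe Bus?", load_dictionary(demo_csv)))
```

The sketch does not escape LaTeX special characters such as `&` or `%`, so text containing them would need extra handling before it compiles.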
Unfortunately, the task cannot be fully automated. Breaking text into sentences requires some knowledge of the specific language, and so does breaking sentences into words. Ideally, the dictionary would also be able to recognize inflected forms, etc. The script therefore just generates a LaTeX file that can be modified manually.
The script can be downloaded here.
 
 