2013-03-08

Remarks on LaTeX Spell Checking

This post collects some remarks on how to improve the language of a (bigger) LaTeX document. Technical issues (compiling, fixing syntax errors, adjusting images, tables, etc.) and the LaTeX typesetting itself ('How do I ... in LaTeX?') often draw a lot of attention away from the actual content of the document. I think that ordinary word processors like LibreOffice Writer have a significant advantage over LaTeX here, even though the result will not be as beautiful. This post discusses some techniques I found helpful to mitigate this problem.

You probably want to improve the quality of a document in several stages. Spell checking should happen on an everyday basis, and after certain periods of time you will want to do bigger reviews to improve the overall consistency of the document.

The first step is to use the editor's integrated spell checking capabilities. In Emacs I found Flyspell Mode quite convenient (M-x flyspell-mode). Alternatively, ispell can be run from within Emacs (M-x ispell-buffer). The downside is that this generates lots of false positives in a more technical document with many acronyms and technical expressions, and having so many words marked on the screen can be quite distracting.

Alternatively, you might want to generate a spelling report for your whole document once in a while. The following short script can be used to generate a spelling report for several .tex files using Hunspell.
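The original script is not reproduced here, so what follows is only a minimal sketch of what such a report generator could look like. It assumes that the hunspell binary and an en_US dictionary are installed. The function names (hunspell_command, misspelled_words, format_report) and the report layout are made up for illustration.

```python
#!/usr/bin/env python3
"""Sketch: collect misspelled words from several .tex files with Hunspell."""
import subprocess
import sys
from collections import Counter
from pathlib import Path


def hunspell_command(tex_file, lang="en_US"):
    """Build the Hunspell call for one file (hypothetical helper)."""
    # -l: print only the misspelled words
    # -t: treat the input as TeX/LaTeX
    # -d: dictionary to use (assumed to be installed)
    return ["hunspell", "-l", "-t", "-d", lang, str(tex_file)]


def misspelled_words(tex_file, lang="en_US"):
    """Run Hunspell on one file and return the words it reports."""
    result = subprocess.run(hunspell_command(tex_file, lang),
                            capture_output=True, text=True, check=True)
    return [w.strip() for w in result.stdout.splitlines() if w.strip()]


def format_report(words_by_file):
    """Turn {file name: [words]} into a plain-text report."""
    lines = []
    for name in sorted(words_by_file):
        counts = Counter(words_by_file[name])
        lines.append(f"== {name} ({len(counts)} distinct words)")
        for word, n in sorted(counts.items()):
            lines.append(f"  {word} ({n}x)")
    return "\n".join(lines)


def main(root):
    """Check every .tex file below root and print one combined report."""
    tex_files = sorted(Path(root).rglob("*.tex"))
    print(format_report({str(p): misspelled_words(p) for p in tex_files}))


if __name__ == "__main__" and len(sys.argv) == 2:
    main(sys.argv[1])
```

If the sketch is saved as spellreport.py (a made-up name), it could be called as python3 spellreport.py ./chapters/ > spelling_report.txt. Counting repeated words makes it easier to spot acronyms that should go into a personal dictionary instead of being fixed.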

Usually bigger LaTeX documents are spread over many files. Finding a string in several files and opening every file that contains it can be done quickly with:
$> find . -name "*.tex" | xargs grep -isl "some word" | xargs emacs

More sophisticated spell checking and grammar checking can be done using LanguageTool. Unfortunately, it cannot be used with LaTeX directly, but detex can be used to remove TeX commands from LaTeX files first. Reviewing the result is a bit tedious because it gives you lots of false positives, but you will probably discover some new language mistakes this way.
$> find ./chapters/ -name "*.tex" -exec detex -n {} \; > doc_detexed.txt
$> java -jar LanguageTool.jar -l en-US -c utf-8 doc_detexed.txt > doc_languagetool_report.txt

In my experience, Microsoft Office spell checking and grammar checking is superior to the tools mentioned above. To open your LaTeX document in Microsoft Word, you can use the tool latex2rtf. Instead of compiling your document to PDF, it generates an .rtf file. This is also an alternative to using detex if you want to use LibreOffice and LanguageTool. If you do not have access to a Microsoft Office installation, Google Docs might also be an alternative.

Xournal allows you to annotate PDF documents. This is handy because directly writing annotations in the PDF allows you to really focus on the content and saves lots of paper if you would otherwise print intermediate stages of your document frequently.

2013-03-02

Print PDFs by Creating Multiple Print Jobs

Printing long PDF documents is sometimes tedious, especially with a dull network printer. You may know the situation: for mysterious reasons the printer does nothing for a very long time and then seems to have forgotten about the print job altogether.

Sometimes printing still works if you send only a few pages of the PDF document at a time as separate print jobs. Doing this manually is also tedious, so here is a simple Python script that breaks up a PDF document into many PDFs with just 2 pages each (using pdftk) and spools them via lpr (the package cups-pdf is required for this).
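The original script is not reproduced here. The following is a minimal sketch of the approach described above, assuming that pdftk and lpr are on the PATH and that the default printer is the intended target. The function names (page_ranges, parse_page_count, print_in_chunks) are made up for illustration.

```python
#!/usr/bin/env python3
"""Sketch: print a PDF as many small print jobs using pdftk and lpr."""
import subprocess
import sys
import tempfile
from pathlib import Path


def page_ranges(num_pages, chunk=2):
    """Return inclusive (first, last) page ranges of at most `chunk` pages."""
    return [(first, min(first + chunk - 1, num_pages))
            for first in range(1, num_pages + 1, chunk)]


def parse_page_count(dump_data):
    """Extract the page count from the output of `pdftk FILE dump_data`."""
    for line in dump_data.splitlines():
        if line.startswith("NumberOfPages:"):
            return int(line.split(":", 1)[1])
    raise ValueError("no NumberOfPages line found")


def print_in_chunks(pdf, chunk=2):
    """Split `pdf` into small PDFs and send each one to lpr as its own job."""
    dump = subprocess.run(["pdftk", str(pdf), "dump_data"],
                          capture_output=True, text=True, check=True).stdout
    with tempfile.TemporaryDirectory() as tmp:
        for first, last in page_ranges(parse_page_count(dump), chunk):
            part = Path(tmp) / f"pages_{first:04d}-{last:04d}.pdf"
            # pdftk IN cat FIRST-LAST output OUT copies just that page range
            subprocess.run(["pdftk", str(pdf), "cat", f"{first}-{last}",
                            "output", str(part)], check=True)
            # Assumption: lpr hands the file to the spooler before it
            # returns, so the temporary file can be removed afterwards.
            subprocess.run(["lpr", str(part)], check=True)


if __name__ == "__main__" and len(sys.argv) == 2:
    print_in_chunks(sys.argv[1])
```

Saved as print_chunks.py (a made-up name), the sketch could be run as python3 print_chunks.py thesis.pdf. A last chunk with a single page is sent as its own job when the page count is odd.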