Monday, March 31, 2008

Formatting text for reading on small devices

I sometimes want to download files, say HTML pages, and convert them to plain text to read on a small portable device (a Nokia N770 or cough cough iPhone cough cough sellout cough).

For HTML files on Mac OS X, this is the command I use. (textutil ships with OS X, so this particular command will not work on Linux.)

textutil -convert txt -strip printableArticle.jhtml.html 

Converting to txt is what removes the HTML tags while keeping the basic flow of the text. As far as I can tell from the man page, -strip just tells textutil to leave out document metadata (title, author and so on). I suggest using the printable version of pages from the interwebs, as this usually has the complete text and usually has no ads. There should also be fewer links, markup, pictures etc. I guess a more unix approach should appear in the comments.
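In the meantime, here is a rough sketch of what a more unix approach might look like, using nothing but sed. The function name strip_tags is made up for this example. It only deletes anything that looks like a tag on a single line, so tags that span several lines, script blocks and entities such as &amp; are left alone. Treat it as a starting point, not a real HTML converter.

```shell
# strip_tags is a hypothetical helper name, not part of any existing tool.
# It reads HTML on stdin and deletes every <...> sequence that opens and
# closes on the same line. Entities and multi-line tags are not handled.
strip_tags() {
    sed -e 's/<[^>]*>//g'
}

# Usage (file names are only examples):
#   strip_tags < printableArticle.jhtml.html > printableArticle.txt
```

On anything beyond simple printable pages, textutil or a text browser's dump mode will probably give cleaner results.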

For PDFs I use pdftotext, which comes with the xpdf tools (and with poppler-utils on most Linux distributions).

pdftotext -layout filename.pdf

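The -layout option tries to keep the original physical layout of each page (columns, indentation), so most of the formatting should stay intact. The text is written to filename.txt next to the PDF. FBReader is an excellent program for reading on the Nokia tablets.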
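If you have a whole folder of PDFs, the same command can be wrapped in a small loop. This is only a sketch: txt_name and convert_pdfs are names I made up for the example, and it assumes pdftotext is on your PATH. It passes the output file name explicitly, which pdftotext accepts as an optional second argument.

```shell
# txt_name is a hypothetical helper: swap a trailing .pdf for .txt.
txt_name() {
    printf '%s\n' "${1%.pdf}.txt"
}

# convert_pdfs is a hypothetical helper: run pdftotext -layout on every
# PDF in the given directory (defaults to the current directory).
convert_pdfs() {
    dir=${1:-.}
    for pdf in "$dir"/*.pdf; do
        [ -e "$pdf" ] || continue   # glob matched nothing, skip
        pdftotext -layout "$pdf" "$(txt_name "$pdf")"
    done
}

# Usage (directory name is only an example):
#   convert_pdfs ~/Documents/papers
```

Each text file ends up beside its PDF, ready to copy over to the device.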
