building a PDF library

the journal of Michael Werneburg

twenty-seven years and one million words

Tokyo, 2010.03.17

In building my business I do a lot of research. I have to: I'm a n00b, and not a particularly gifted one at that.

I find myself routinely returning to a few websites in my research. Sites like Copyblogger, for instance, which is about as to-the-point as it gets for me these days. Great content, well presented... only, damn difficult to navigate.

In fact, I've noticed that since the rise of Google an increasing number of website administrators seem to have given up on the fine art of organizing the content on their sites. It's as if they've handed the job of finding articles on their site over to a Google search from the outset.

But Google's not always great for re-finding something you've already read, especially if the exact title doesn't stick in your head. So rather than return to Google to find something time and again, I've recently started saving PDF versions of the pages I need. To do this, I need two things:

1. "Readability", an outstanding tool that cuts away all of the chaff from article-based websites.

2. "Print-to-PDF" functionality.

Happily, on my Mac the latter is built into the system. On a Windows system you can find a few free print-to-PDF tools. I can't recommend any of them over the others, but they all work the same way: as a faux printer to which you send formatted material.

So the trick is to get the "Readability" widget on your browser's toolbar. Go to a really useful page like this article on press releases, and click that widget. It will cut out all of the left-column and right-column ads, headers, and links. It will also present a very readable version of the centre column's content in a large serif font, without distracting justification tricks or fancy flows.
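For the curious, here's a toy sketch of the general idea of cutting away the chaff. To be clear, this is not how Readability itself works (I don't know its internals); it's just a minimal illustration that assumes the article text lives in paragraph tags and that anything inside navigation, sidebar, header, or footer tags can be thrown away. The names SKIP_TAGS, ParagraphExtractor, and extract_article_text are my own inventions for this example.

```python
from html.parser import HTMLParser

# Tags whose contents count as chaff in this toy example.
# This list is an assumption, not Readability's real rule set.
SKIP_TAGS = {"nav", "aside", "header", "footer", "script", "style"}


class ParagraphExtractor(HTMLParser):
    """Collect the text of <p> elements that sit outside the skipped tags."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0        # how many skipped tags we are inside
        self.in_paragraph = False  # True while reading a kept <p>
        self.paragraphs = []       # finished paragraphs, in page order
        self._buffer = []          # text pieces of the current paragraph

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1
        elif tag == "p" and self.skip_depth == 0:
            self.in_paragraph = True
            self._buffer = []

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1
        elif tag == "p" and self.in_paragraph:
            # Collapse runs of whitespace so each paragraph is one clean line.
            text = " ".join("".join(self._buffer).split())
            if text:
                self.paragraphs.append(text)
            self.in_paragraph = False

    def handle_data(self, data):
        if self.in_paragraph and self.skip_depth == 0:
            self._buffer.append(data)


def extract_article_text(html):
    """Return the kept paragraphs of an HTML page as a list of strings."""
    parser = ParagraphExtractor()
    parser.feed(html)
    parser.close()
    return parser.paragraphs
```

Feed it a page's HTML as a string and you get back only the paragraphs outside the skipped tags. Real pages are far messier than this, which is exactly why a purpose-built tool is worth having.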

Then simply print it to a PDF document, et voilà: a very readable PDF document that you can find without resorting to sifting through Google results. As a bonus, it can be read offline.
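Once the collection grows, re-finding a saved page can be as simple as matching words against file names. Here's a minimal sketch, assuming the PDFs all live under one folder and were given descriptive names when they were saved. The function name find_saved_pdfs and the folder layout are hypothetical, just for illustration.

```python
from pathlib import Path


def find_saved_pdfs(library_dir, *keywords):
    """Return PDFs under library_dir whose file name contains every keyword.

    Matching is case-insensitive and looks only at the file name,
    not at the text inside the PDF.
    """
    wanted = [k.lower() for k in keywords]
    matches = []
    # rglob walks the folder and all of its subfolders.
    for path in Path(library_dir).rglob("*.pdf"):
        name = path.stem.lower()
        if all(k in name for k in wanted):
            matches.append(path)
    return sorted(matches)
```

Something like find_saved_pdfs("pdf-library", "press", "release") would then list every saved page with both words in its file name. It only works if the file names are meaningful, so it pays to rename the PDF sensibly at the moment you print it.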

P.S. I'm not picking on Copyblogger in particular. It's just that I go there every week, and have done so for nearly two years now (frequenting the same posts, like as not). Copyblogger is on WordPress, and I've heard complaints from other sites that use WordPress that the platform makes coherent organization of files difficult. At least they didn't go with one of those forum-based site designs, which are the antithesis of indexed and are not even browsable.
