In an earlier post, I sounded off on the topic of documentation. Documenting your work is essential to making smooth progress in data mining, and in most other kinds of exploratory knowledge work. At its simplest, documentation is the aide-mémoire that saves you time when repeating a complex task, or just some notes to yourself about where you left off on a project so you can get right back at it the next day. At its best, it’s a transfer of knowledge that transcends time and staff changes, with great benefits for the very health of the organization you work for. (And you do care about the organization you work for, don’t you?)
Today I’ll talk about tools for recording and storing documentation, and about content: what effective documentation should include.
You don’t need much in the way of tools. I use only two, one for text and one for images: Microsoft Word and IrfanView. I use Microsoft Word only for its ubiquity and therefore shareability; but any word processor or page layout program that can incorporate images is fine. I use IrfanView, a free image viewer, because it has enough features for my purposes and no more.
(You can use whatever you wish, but keep it simple. I won’t tell you how to use Word, but here’s a tip for IrfanView. When you need to capture a visual from your screen, to show the layout of certain key items in a software interface for example, use your computer’s screen capture function to save a copy of the screen image to the clipboard. In IrfanView, press Ctrl-V (on a PC) to paste the captured image. Click and drag with the mouse to select the area you want, and press Ctrl-Y to crop the image. No need to save the image as a file; just press Ctrl-C to copy, move over to your open Word doc, and press Ctrl-V to paste the image where your cursor is. Done.)
I’m making some assumptions about how you’re going to store and possibly disseminate your documentation. I’ve opted for electronic capture, not only because the majority of the knowledge worker’s craft is conducted via the computer screen, but because it facilitates central storage and encourages regular and seamless updating, which is a must for useful documentation.
Another assumption is that you’re creating documents that either make up your own personal store, or are shared with a limited number of people — no more than a dozen. For large audiences, you might consider other capture and distribution options such as wikis, intranet pages, video captures, or podcasts, but I’m deliberately steering clear of these options because the goal is to make documentation second nature: You need to use tools that can readily be called up and pushed aside over and over. And really, there are few workaday processes that cannot be adequately described with words and images alone. (Certain complicated surgeries excepted.)
You should immediately recognize when something you’re doing is a PROCESS (i.e., involves more than one or two steps), and then you should get it down, right then and there, WHILE YOU ARE DOING IT. For that, the activity has to be as painless and unobtrusive as possible. That’s all I have to say about tools — no fancy software, just the basics.
As for content, obviously that’s up to you and it depends on the size and complexity of the task you’re trying to document. But good documentation has the following attributes in common:
Separate files: I think each task should be created as a separate document. Gigantic manuals of hundreds of pages that describe your entire job are going to be very difficult to update selectively, and they’re cumbersome to share. (Think of each documented process as a recipe on a file card.) Create separate documents that address specific tasks, and give them descriptive file names to make them easier to find. Include the revision date in the file name.
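If you would rather not type file names by hand, the naming habit above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function name `doc_filename`, the hyphenated style, the `.docx` default, and the sample task and date are all hypothetical, not a convention you need to follow.

```python
import re
from datetime import date


def doc_filename(task: str, revised: date, ext: str = "docx") -> str:
    """Build a descriptive file name that ends with the revision date.

    Hypothetical helper for illustration; adapt the style to your own shop.
    """
    # Lowercase the task description and collapse every run of characters
    # that is not a letter or digit into a single hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", task.lower()).strip("-")
    # ISO dates (YYYY-MM-DD) sort chronologically in a file listing.
    return f"{slug}_{revised.isoformat()}.{ext}"


# Sample task name and date are made up for the example.
print(doc_filename("Refresh donor score table", date(2011, 3, 14)))
# prints: refresh-donor-score-table_2011-03-14.docx
```

Putting the date in year-month-day order means that successive revisions of the same document line up in order when the folder is sorted by name.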
Central repository: That’s fancy talk for putting your (electronic) files on a shared drive where others can find them, and where there is a chance that they will be backed up regularly (inquire about that). Don’t save important documents to your hard drive, which will eventually fail.
Electronic version is THE version: Make it a rule that the version saved to the shared drive is the master copy. The master copy is always the most up-to-date version. When I need to follow a documented process, I print the file and refer to it as I work. I mark up the printout with any improvements or revisions that I see are needed, and when I have time, I revise the master copy and throw the paper copy in the shredder bin. (If you have two monitors, revise the electronic copy as you go and don’t bother with paper.)
Iterative updates: Processes change all the time, and documentation must keep up. Rather than trying to adhere to some sort of artificial schedule for document review, make changes at the time you’re actually using the process, as suggested above. Small changes over time are easy to manage and really add up.
Creation and revision dates: The revision date is in the file name, but put both the creation and revision dates at the head of the document as well. That way, you can compare a printed copy of the file with what is stored on the shared drive. Oh, and do NOT ever use a field that autopopulates with today’s date; every time someone opens the document, it will display the current date, making it hard to figure out how old the document is.
Context: Are there previously documented processes that precede what you’re documenting now? Are there documented steps that follow? Make reference to them, with instructions for where to find them. If someone’s described it better than you can, don’t reinvent the wheel. Sometimes the software manual is the best guide. Not often, mind you, but sometimes!
Software versions: If your process relies on software, make note of the version number near the head of your document. Processes frequently change as a result of major upgrades — this is particularly true of databases — and any upgrade might trigger the need for a revision. Again, documentation never stands still.
Prerequisites: Does the process presume the user has access to a database, database table, service, server, or other resource that normally requires a password or account? List those requirements near the beginning, if the process is likely to be needed by someone new. In large organizations, getting an account set up can itself be a time-consuming process. If you’ve figured it out, don’t just be satisfied to finally be “in”. Help the next person get on board.
Bulleted or numbered lists: Incredibly helpful for someone needing to carry out a series of steps in a certain order. Learn how to quickly format a list — numbers if order is important, bullets if not. If the process is long and complex, start by providing a summary of the basic stages of the process, and then go into detail; the summary will help keep the user oriented during the process.
Recipe, not manual: If your document needs a table of contents or an index, you’re probably cramming too much into it. The core of your document should be: “do this, then do this, then do this.” Like a recipe. Consider adding appendices to hold discussions of side-issues, large tables of information, alternative processes, older versions of the process that might need to be called up again, and so on. Keep major diversions out of the main body of the document.
Numbered pages: Duh.
Background knowledge brought to foreground: Is your document useful to someone who isn’t you? If you are documenting a process you are already familiar with, the odds are stacked against you. The reason is that knowledge mastered tends to be internalized and pushed to the background. You just assume (wrongly) that the next schlub to come along already knows what you know. Keep the beginner in mind, and spell out what you think is obvious. This is not easy, however: It is so much better if you document the process AS YOU ARE LEARNING IT YOURSELF. Learning and documenting are NOT separate stages.
Visuals, when necessary: Text will often suffice, but sometimes it takes a screen shot to make the point clear.
Expectations and exceptions: Let the user in on what results to expect. If it’s likely going to take half an hour to run a job, provide a warning. Also, anticipate some of the possible errors someone other than yourself might encounter, and suggest fixes. Many processes have critical points where a tiny mistake that is easy to miss will throw things off the rails — step into your ‘beginner mind’ and flag those points.
Explain “why”: Sometimes the worst thing about following a set of steps is having to wonder, “WHY am I doing this?” People will be tempted to skip steps that don’t seem to have any reason behind them; they learn the reason only when things go wrong. Here’s a simple example of explaining the Why, which can save a person some head-scratching: “In the next block, enter ‘02:00’ as the Submit Time. This will ensure that the job runs immediately. Setting another time or leaving the field blank may cause the job to be queued to run overnight.”
Good spelling and Shakespearean powers of expression are optional, as are fancy software tools. If you care about what you’re doing, you care enough to document, and that is all you really need.