Below are three simple things I like to do when starting a programming project, no matter how big or small.

  1. Set up source code management (SCM) for your project. These days it is incredibly easy to use a distributed SCM like Git or Mercurial; creating a repository takes a single command. In Git:

    git init myproject
    

    or Mercurial:

    hg init myproject
    

    Unlike in the past, when setting up a repository took a lot of planning, particularly for a project with multiple developers, these modern SCMs let you track your history from the very beginning and be ready to collaborate on development. In some cases, such as in a corporate environment, a centralized SCM like Subversion or Microsoft's Team Foundation Server may be preferred. Either way, start using SCM right away.

  2. Pick an easy-to-use logging system for your language and log copiously in your code. I have generally written my own to get started: just enough to open a log file and write each message to the screen and the file simultaneously. Logging is especially important when developing high-throughput, non-interactive data pipelines. It's infeasible to watch the screen 24 hours a day, so logs become invaluable when your pipeline, or any other software, breaks while you are not in front of it. Make sure the log format is friendly to mining too: completely free-form text is difficult to parse and use with automated tools.

  3. Write the simplest useful program for your project. This might be as simple as putting a dialog on the screen or parsing command-line arguments. Once you get it working, commit it to the SCM you set up in step 1. From that point on, you always have a working program you can show to others, and you have something you can test and run regression tests against.
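
As a rough sketch of how steps 2 and 3 might look together, here is a minimal Python program that parses command-line arguments and logs to the screen and a file at the same time. It assumes Python's standard `logging` and `argparse` modules; the logger name `myproject`, the `input` argument, the `--log` option, and the tab-separated format are all illustrative choices of mine, not requirements.

```python
import argparse
import logging
import sys


def setup_logging(log_path):
    """Send every log record to both the screen and a log file."""
    # Tab-separated fields (time, level, logger, message) are easy to
    # split with automated tools later. The layout is an assumption.
    formatter = logging.Formatter(
        fmt="%(asctime)s\t%(levelname)s\t%(name)s\t%(message)s",
        datefmt="%Y-%m-%dT%H:%M:%S",
    )
    logger = logging.getLogger("myproject")  # hypothetical project name
    logger.setLevel(logging.DEBUG)
    logger.propagate = False  # keep records out of the root logger

    # Remove handlers from any earlier call so messages are not duplicated.
    for old in list(logger.handlers):
        logger.removeHandler(old)
        old.close()

    for handler in (logging.StreamHandler(sys.stderr),
                    logging.FileHandler(log_path)):
        handler.setFormatter(formatter)
        logger.addHandler(handler)
    return logger


def main(argv=None):
    """Simplest useful program: parse the arguments and log what happened."""
    parser = argparse.ArgumentParser(description="Minimal starting point.")
    parser.add_argument("input", help="file to process (hypothetical)")
    parser.add_argument("--log", default="myproject.log", help="log file path")
    args = parser.parse_args(argv)

    logger = setup_logging(args.log)
    logger.info("started input=%s", args.input)
    # The real work of the project would go here.
    logger.info("finished input=%s", args.input)
    return 0
```

Calling `main(["data.txt"])`, or adding the usual `if __name__ == "__main__": sys.exit(main())` guard and running the script, writes two log lines to both the terminal and `myproject.log`. Because each line has four tab-separated fields, something as simple as `line.split("\t")` is enough to mine the log afterwards.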

By doing these three things, you will be able to track and study the evolution of your project, track down errors that occur in the middle of the night or on the weekend when no one is watching, and always have a working product to fall back on.