Web Page Size is Vital

When I first learned how to program a computer, optimization was a big deal. Figuring out how to squeeze every bit of performance out of a subroutine was difficult but rewarding. Articles were frequently written about how to best go about optimizing source code.

In the late 1990s, I began working on my first web applications. Bandwidth was expensive, so we worked on ways to make our websites more compact. We compressed web pages and figured out ways to strip out whitespace. However, today websites have quite a bit going on in the front of the house. There’s a lot of JavaScript and CSS that gets passed to the browser, and as a result, web applications are transmitting more data than ever.

Tammy Everts writes a blog called Web Performance Today, where she follows trends in web application development. It is essential for web developers to pay attention to the amount of data they send to users and how that affects application performance.

Ms Everts has shown over and over that web pages are growing. She points out that the average web page has grown 186% since 2010, and it shows no sign of stopping. I believe that every responsible web developer owes it to himself or herself to follow Ms Everts’ blog.

Please, fellow web developers, pay attention to how big your web pages are getting. Let’s reverse this trend.

Hypothes.is Web Annotation Tool

While working on the Philalethes E-Bulletin Online Reader, I came across a useful web-based annotation tool called Hypothes.is. It’s worth checking out. The tool uses a browser plugin to provide a number of cool features.

  • Annotation
  • Discussion
  • Tagging
  • Sharing
  • Privacy Control

It also provides an annotation stream that allows you to view public annotations as they’re being made all over the web.

I’ve installed the Hypothes.is WordPress plugin so you can experiment with Hypothes.is on this website. Please try it out!

Philalethes E-Bulletin Online Reader

I began working on the Philalethes E-Bulletin in the Fall of 2013, and published the first issue in January of 2014. The E-Bulletin is published quarterly in EPUB and MOBI formats.

It’s been a great learning experience. Not only have I learned a lot about editing, but I’ve really had to dive into how electronic publishing works. The intricacies of electronic book formats have become well-known to me.

The Philalethes Society isn’t entirely comfortable with modern technology, however. Most complaints about the E-Bulletin came from those who didn’t have e-book readers and weren’t comfortable installing software on their PC to handle a new file format. Because of this, I built an online e-book reader specifically for the E-Bulletin.

The online e-book reader is based around the excellent EPUB.js library, with additional backend code written in PHP.

Click here to visit the Philalethes E-Bulletin Online Reader.

An English-language Stemmer for OCaml

A stemming algorithm attempts to reduce words to their stem. For instance, “swimming” would be reduced to “swim”, and “avocados” would become “avocado”. This is useful in a number of situations, most especially in searching text. This library is a direct port of the Porter English stemming algorithm.
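To make the idea concrete, here is a toy suffix stripper. It is not the Porter algorithm and not code from the library: the function names, the two rules, and the length thresholds are all assumptions chosen only to reproduce the two examples above.

```ocaml
(* A toy suffix stripper, NOT the Porter algorithm: it knows only two
   rules, picked to reproduce the examples in the text. All names and
   thresholds here are illustrative assumptions. *)

(* True when [word] ends with [suffix]. *)
let ends_with word suffix =
  let wl = String.length word and sl = String.length suffix in
  wl >= sl && String.sub word (wl - sl) sl = suffix

(* Drop the last [n] characters of [word]. *)
let chop word n = String.sub word 0 (String.length word - n)

(* Collapse a doubled final consonant ("swimm" becomes "swim"),
   leaving vowels and the letters l, s and z alone. *)
let undouble word =
  let n = String.length word in
  if n >= 2
     && word.[n - 1] = word.[n - 2]
     && not (String.contains "aeioulsz" word.[n - 1])
  then chop word 1
  else word

let toy_stem word =
  if ends_with word "ing" && String.length word > 5 then
    undouble (chop word 3)
  else if ends_with word "s"
          && not (ends_with word "ss")
          && String.length word > 3
  then chop word 1
  else word

let () =
  List.iter
    (fun w -> Printf.printf "%s -> %s\n" w (toy_stem w))
    [ "swimming"; "avocados"; "glass" ]
```

A real stemmer needs many more rules and conditions than this. The Porter algorithm applies its rules in several ordered steps and measures the structure of the remaining stem before stripping anything.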

It was one of my first OCaml projects. I wrote it back in 2003, when I was still new to the language. I had been spending a lot of time writing C libraries that were being called by Perl scripts for my day job. Perl has, or had, a cumbersome, messy interface to C that made such interfaces very difficult to write and maintain.

When I discovered how easy it was to link C libraries into OCaml, I was overjoyed! This was my first attempt. Before reading further, check out my ocaml-stemmer library on GitHub.

Updating the Code

Recently, while overhauling all of my publicly-available code, I decided to update my English-language stemmer for OCaml. It’s not a very large piece of code, but its age really shows. It wouldn’t compile cleanly with the latest version of OCaml. It looks like the code of somebody who hasn’t really grokked functional programming yet. Just look at this.

let rec replace_end word (rule_list : (int * string * string * int) list) =
  match rule_list with
      hd :: tl ->
        if (match_rule word hd) then
          let (rule, _, _, _) = hd in
            (rule, apply_rule word hd)
        else
          replace_end word tl
    | [] ->
        (0, word)

Ouch, right?

I decided that the scary code would stand as a good message (or maybe I should say a good warning) to future functional programmers. For now, I just wanted to get this code to compile and not look messy. That ended up being easy.
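For comparison, here is a rough sketch of how replace_end might look in a more idiomatic style. It is not what the library does. The match_rule and apply_rule stand-ins below are hypothetical, since the real ones are not shown, and the meaning given to each field of the rule tuple is an assumption. List.find_opt needs OCaml 4.05 or later.

```ocaml
(* Rule shape taken from the original type annotation. The meaning of
   each field is an assumption: (id, suffix, replacement, unused). *)
type rule = int * string * string * int

(* Hypothetical stand-in: a rule matches when the word ends with its suffix. *)
let match_rule word ((_, suffix, _, _) : rule) =
  let wl = String.length word and sl = String.length suffix in
  wl >= sl && String.sub word (wl - sl) sl = suffix

(* Hypothetical stand-in: swap the matched suffix for the replacement. *)
let apply_rule word ((_, suffix, replacement, _) : rule) =
  String.sub word 0 (String.length word - String.length suffix) ^ replacement

(* Same result as the hand-rolled recursion: apply the first matching
   rule and return its id, or (0, word) when no rule matches. *)
let replace_end word (rules : rule list) =
  match List.find_opt (match_rule word) rules with
  | Some ((id, _, _, _) as rule) -> (id, apply_rule word rule)
  | None -> (0, word)

let () =
  let rules = [ (1, "sses", "ss", 0); (2, "ies", "i", 0) ] in
  let id, stemmed = replace_end "ponies" rules in
  Printf.printf "%d %s\n" id stemmed
```

Letting List.find_opt do the search removes the explicit recursion and the nested if, so the only thing left to read is what happens when a rule is found and what happens when none is.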

Finding The Bug

Once I got it compiled cleanly, however, I found a bug. Back in 2003, I was big on test-driven development. I wrote tests for lots of code. The OCaml stemmer, it turns out, has been broken for quite a while. It doesn’t handle words with apostrophes correctly!

I thought that fixing the bug would be a challenge. However, I quickly discovered in the OCaml manual that the or operator was deprecated, and that || should be used instead. Embarrassingly, the or operator was deprecated back in 2002. That never should have been in the code! You can view the commit which fixed the bug here.

My Stemmer Library is Now on OPAM

This is my second library on OPAM, after my prime number library. You can view it on OPAM here.

Official Release of Libbucket 1.0.4

I’ve tagged version 1.0.4 of libbucket over on GitHub. You can download a tarball at this link. If you’d like to read more about libbucket, see my post from earlier this week.

Very handy edit:

You can download version 1.0.4 of libbucket and an OpenPGP signature here:


I never realized how much bucket clip-art existed.

Small Team Software Change Management

Until October, I’d been using a paid GitHub account to manage source code changes and issue tracking for private projects. GitHub is a software-as-a-service (SaaS) product providing a web-based interface for source control management and various project tracking tasks. Some people love it and some aren’t fond of it.

My software development clients are typically small companies wanting fairly simple web applications. They hire me because having a developer on staff doesn’t fit into their budget or business plan. They don’t usually care what the source code for their project looks like, but they do care about tracking issues.

Because of the scope of these applications, it’s rare that I work with other programmers. This meant that I wasn’t using any of the special features of GitHub for private code repositories, so in October I cancelled my subscription.

My private repositories are now self-hosted, and I browse them using GitList, which bills itself as “an elegant and modern git repository viewer.” It looks nice, and I’ve got no complaints. For issue tracking, I use Freshbooks, a SaaS accounting system. With Freshbooks, I can not only keep track of bug reports and issues, but I can record time spent on bug reports, feature creep, and other client-related issues.

GitList and Freshbooks aren’t a perfect solution. At some point, I will be working with another developer, and we will need a way to track bugs and issues internally. When that happens, I plan to deploy Gitolite and find some new issue-tracking solution.

By the way, another reason I stopped using paid GitHub features is that they’ve already made plenty of money, and I’m not sure they’re doing the right things with all of that money.

I’m curious about what others are using. How does your incredibly small team track code changes and issues? Are all of your software issues internal, or are you developing for clients? I’d love to hear some ideas.

Modernizing libbucket

If you’re here to learn about my experience in software development, you’ve probably poked around my GitHub page. One of the older projects on there is libbucket, a very fast dynamic string buffer library. I originally wrote it while working for Musician’s Friend, and was given permission to release it as an open source library in 2005.

Recently I decided to update the build system in the library, which was using an old version of autoconf and automake. I haven’t worked with those tools in a number of years. They are solid and flexible, but they’re also a confusing tangle of m4 macros and crazy shell scripts. Also, they change a lot.

A few important things had changed. For instance, aclocal wanted to read from configure.ac instead of configure.in. In addition, the AM_INIT_AUTOMAKE macro was completely different, but the tool was nice enough to point me to the relevant part of the automake manual.

Building a library is also a little different now than it was in 2002. GNU Libtool is a great program for building static and shared libraries correctly for Unix systems, but its usage is different now. Luckily, it spit out all of the information I needed to update things.

One thing I didn’t quite figure out is how to get automake to recognize the README.org file as satisfying its README requirement. I ended up with an initialization block in configure.ac that looked like this:

AC_INIT([libbucket], [1.0.4])

You can see the unpleasant “cheat” on line 4. Sorry about that, world.

After all of that mess, there were just a couple of small fixes to the documentation, which is written in GNU Texinfo, and the library compiled just fine.

Unfortunately, I don’t have any tests. When I first developed libbucket, we had a proprietary test interface for C and C++ libraries at Musician’s Friend. That never got open sourced, so I had to remove it all before making the libbucket code public. Maybe tests are next.

If you’d like to take a look at the changes I made, here’s the Git commit.

Metaphorically similar to this kind of bucket.

PunchlinePDX Event Manager


PunchlinePDX is a slow-motion video booth for events and parties based out of Portland, Oregon. Earlier this year, I helped them develop event management software that would allow them to upload and curate video.

This was their first experience hiring a software developer, so I had the opportunity to walk them through the entire process. We started by outlining requirements and coming up with a solid plan with application screens, functions, and things to meet their business needs. We then brainstormed additional features and came up with something pretty amazing.

These are a few of the interesting things we came up with:

  • Cloud-backed storage for all videos
  • Text messaging interface
  • Smart social media sharing
  • Contact management
  • Event and sharing privacy

The best part was the testing process. While I ran them through their new software, they made slow-motion video of me and used the software to upload and manage it. Check it out!

You should seriously consider booking these guys for your holiday party.

Handy Tools for the Bourne Again Shell

If you’re a Unix geek, you’ve probably used bash, the Bourne-Again Shell. If you’ve been around a while, you’ve probably spent a lot of time customizing bash.

Back when I worked for Yahoo!, my friend Bryan gave me a great directory stack for bash. I loved it, so I rewrote it and have been hacking on it and using it ever since. Observe.

[user@host:~]$ cd /tmp
[user@host:/tmp]$ cd /var/log
[user@host:/var/log]$ dl
1   ~
2   /tmp
3 * /var/log
[user@host:/var/log]$ go 1
1   /home/user
[user@host:~]$ go log
3   /var/log

In addition to the handy new go command, it also includes b and f for moving backwards and forwards on the stack. It was inspired by pushd and popd, but it’s so much more.

If you’d like to check it out, take a look at my bashtools repository on GitHub or just download version 1.0. I don’t change it very often, but I’m thinking of hammering out a long-standing bug in the directory stack code.

N.B. If you use all of my scripts, you’ll get some great prompts for your xterms. You’re welcome!

Not this kind of shell.