Robert Važan

What's in a good software library?

High-level advice on API design from a seasoned developer and author of several open-source libraries.

Programming can be thought of as a form of communication with computers. There's even an AI philosophy that says that intelligent computers are merely ordinary computers that can be programmed quickly and easily. So what makes computers easy to program? Or conversely, what makes programmers productive?

If programming is communication, then libraries provide the vocabulary for this communication. The relationship between programming languages and libraries is like the relationship between intelligence and knowledge. Libraries and the vocabulary they provide (akin to knowledge) matter more than the access to raw computing power mediated by the programming language (akin to intelligence).

Libraries have their limits. It's hard to deliver centralized services like a database in the form of a library. Even where centralization is not strictly required, associated cloud services can make libraries smarter and reduce the amount of complexity visible to programmers. The goal of such a library is not to embed a subprogram but rather to expose a convenient API. It doesn't matter that much what is behind the API, be it pure code, a subprocess, or an associated online service. Pure code is nevertheless still preferred for ease of deployment and for its ability to function offline.

People often think of programming as telling computers what to do. But programming is two-way communication. Visibility matters. Black box tools are hard to debug. Specificity of communication matters too. Reporting "bug at line 333" is far more useful than reporting "bug in the program". Libraries must ensure they can be debugged and profiled and that their state can be inspected. They need to speak and they need to speak clearly. That can often be accomplished with standard tools, but many non-trivial libraries need specialized diagnostic APIs or even external tools.
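To make the point about specificity and inspectable state concrete, here is a minimal sketch in Python. Everything in it is hypothetical: `ConfigParser`, `ParseError`, and the `diagnostics()` method are names invented for illustration, not the API of any real library. The error carries the line number, and the parser can report what it has seen so far.

```python
from dataclasses import dataclass, field


class ParseError(ValueError):
    """Hypothetical error type that says where the problem is."""

    def __init__(self, line_number: int, message: str):
        super().__init__(f"line {line_number}: {message}")
        self.line_number = line_number


@dataclass
class ConfigParser:
    """Hypothetical key=value parser that exposes its state for inspection."""

    values: dict = field(default_factory=dict)
    lines_seen: int = 0

    def parse(self, text: str) -> dict:
        for number, line in enumerate(text.splitlines(), start=1):
            self.lines_seen = number
            stripped = line.strip()
            # Blank lines and comments carry no data.
            if not stripped or stripped.startswith("#"):
                continue
            if "=" not in stripped:
                raise ParseError(number, f"expected key=value, got {stripped!r}")
            key, value = stripped.split("=", 1)
            self.values[key.strip()] = value.strip()
        return self.values

    def diagnostics(self) -> dict:
        """Snapshot of internal state, meant for debugging."""
        return {"lines_seen": self.lines_seen, "keys": sorted(self.values)}
```

A caller that catches `ParseError` learns which line failed, and `diagnostics()` shows how far parsing got before it stopped. Real libraries would likely expose more than this, but the shape is the same.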

Simplicity and ease of use merely accelerate adoption of new libraries without making them fundamentally more useful. Fast adoption is nevertheless very important in the rapidly changing software landscape.

Interactivity of communication is important. It permits experimentation. Programming can be thought of as a dialogue. It is, however, still important to keep libraries predictable. No REPL can match the speed of internalized mental models. Not to mention that mental models permit partial simulation of the system, while interactive shells require a concrete implementation.

There's a difference between libraries and frameworks. As I see it, frameworks ask closed questions (e.g. Which serialization format do you prefer?) while libraries ask open-ended questions (e.g. What would you like to do today?). While frameworks may initially help with writer's block, one eventually runs into problems that are hard to solve with the framework. You find yourself trying to trick the framework into asking you the right question. Libraries don't add artificial restrictions on what you can say. You can combine vocabulary provided by the library however you want and you can even combine it with vocabularies provided by other libraries. This is called expressivity.
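The contrast between closed and open-ended questions can be sketched in a few lines of Python. This is only an illustration under my own assumptions: `framework_export`, `keep`, and `encode` are made-up names, and real frameworks are of course far richer than one function with a format switch.

```python
import json
from typing import Any, Callable, Iterable


def framework_export(records: Iterable[Any], fmt: str) -> str:
    """Framework style: a closed question with a fixed list of answers."""
    if fmt == "json":
        return json.dumps(list(records))
    if fmt == "lines":
        return "\n".join(str(r) for r in records)
    raise ValueError(f"unsupported format: {fmt!r}")


def keep(records: Iterable[Any], predicate: Callable[[Any], bool]) -> list:
    """Library style: one small word of vocabulary."""
    return [r for r in records if predicate(r)]


def encode(records: Iterable[Any], encoder: Callable[[list], str]) -> str:
    """Library style: the caller supplies the encoder, from any library."""
    return encoder(list(records))


numbers = [1, 2, 3, 4]
evens = keep(numbers, lambda n: n % 2 == 0)

# The same vocabulary combines with the standard library...
evens_as_json = encode(evens, json.dumps)

# ...or with a function the library author never anticipated.
evens_as_sum = encode(evens, lambda rs: " + ".join(map(str, rs)))
```

With `framework_export`, any format outside the built-in list is a dead end. With `keep` and `encode`, the caller decides what to say and which other vocabularies to mix in.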

Allowing complex ideas to be expressed in the library's vocabulary shouldn't, however, be an excuse to make it unreasonably hard to express simple ideas. Complexity should always be optional. This is usually done through reasonable defaults. The application developer can then get a working solution quickly and improve it incrementally afterwards. Reasonable defaults are a big part of the interactivity and experimentation mentioned above.
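Here is a small Python sketch of what optional complexity through defaults might look like. The `to_csv` helper and its parameters are hypothetical, invented for this example. It is built on the standard `csv` and `io` modules.

```python
import csv
import io


def to_csv(rows, *, delimiter=",", header=True, line_ending="\n"):
    """Hypothetical export helper where every option has a default."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=list(rows[0]),  # column order follows the first row
        delimiter=delimiter,
        lineterminator=line_ending,
    )
    if header:
        writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = [{"name": "Ada", "score": 90}, {"name": "Linus", "score": 85}]

# The simple idea takes one call with no configuration.
simple = to_csv(rows)

# Complexity is opt-in, one keyword at a time.
custom = to_csv(rows, delimiter=";", header=False)
```

The first call is what a newcomer types into an interactive shell to get something working. The second shows the incremental path: each option is discovered and overridden only when the default stops being good enough.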