ChatGPT is a very old-fashioned programmer
ChatGPT is technically an artificial general intelligence, and it writes code pretty well, but once you start challenging it, you notice limitations that seem characteristic of all language models, at least when they are applied to programming tasks naively and directly.
After experimenting with giving ChatGPT programming tasks for a while, I can summarize its key limitations as follows:
- No access to API documentation. ChatGPT relies exclusively on whatever it remembers from its training data, and when it fails to remember, it guesses or outright makes things up. You can paste fragments of documentation into the prompt to get better results, but that's inconvenient and limited.
- No iterative improvement. There's a reason programming is called software development: it's an inherently iterative process consisting of countless individual changes to code and program design. ChatGPT writes programs top-down in a single pass, which is an impressive stunt if you think about it, but it's not likely to result in optimal output.
- No testing. If it is untested, it is broken. I live by this rule. The closest ChatGPT can get to running a program is explaining how it works step by step. It is actually quite impressive that ChatGPT often produces correct programs on the first try, but testing is still necessary to raise the quality of its output. You can run the program yourself and paste the compiler's messages and the program's output back into the prompt (see the sketch after this list), but that's very inconvenient.
- Tight limit on program size. Since ChatGPT has neither long-term memory nor any scratch area where it could keep unbounded notes, it is inherently incapable of producing large programs. You can feed it one small task at a time, but that results in gross inconsistencies between the various parts of the program, which ChatGPT is unable to notice and fix.
- No experience. Programmers are set in their ways for a reason: they have a long list of personal rules and judgments learned because those rules brought them results in the past. ChatGPT inherits some of these rules from the code it sees on the Internet, but it cannot learn new ones when something breaks or when it gets negative feedback. It consequently tends to repeat the same mistakes over and over.
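To make the "run it yourself and paste the errors back" workflow from the testing point concrete, here is a minimal sketch of how it could be automated. It assumes the 2023-era openai Python client (openai.ChatCompletion) with an API key already configured; the prompt, model name, and retry count are illustrative, and details such as stripping markdown fences from the reply are omitted. It is an illustration of the loop, not a production tool.

```python
# Minimal sketch of the manual feedback loop: generate code, run it,
# and paste any error back into the conversation for another attempt.
import subprocess
import openai  # assumes the pre-1.0 openai client and a configured API key

def ask(messages):
    # Send the conversation so far and return the assistant's reply.
    reply = openai.ChatCompletion.create(model="gpt-3.5-turbo", messages=messages)
    return reply["choices"][0]["message"]["content"]

messages = [{
    "role": "user",
    "content": "Write a plain Python script (no markdown) that prints the first 10 primes."
}]
for attempt in range(3):  # a few repair rounds, mirroring the manual workflow
    code = ask(messages)
    messages.append({"role": "assistant", "content": code})
    # Run the generated code and capture whatever it writes to stdout/stderr.
    result = subprocess.run(["python", "-c", code],
                            capture_output=True, text=True, timeout=10)
    if result.returncode == 0:
        print(result.stdout)
        break
    # Paste the error back into the conversation, exactly as a human would.
    messages.append({
        "role": "user",
        "content": "The script failed with:\n" + result.stderr + "\nPlease fix it."
    })
```

This is essentially what the plugin and LangChain-style tooling discussed below aims to do for you, without the copy-and-paste.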
Entertainingly, this reminds me of my youth in the 1990s, the age of "true programmers" who wrote their programs from start to finish without testing them once. Back then, nobody read API documentation, if there was any at all; people just used the few APIs they knew well. Short, algorithmically dense programs were a source of pride, and there were plenty of short batch files and scripts. You could say that ChatGPT is very much such an old-fashioned programmer from the 1990s.
Of course, we will all stop laughing when ChatGPT improves or gets a smarter competitor. The recently announced plugins will most likely allow it to look up documentation, verify code with a compiler, and run unit tests as well as console programs. People are already experimenting with iterative use of language models (see LangChain, for example). Note-keeping and IDE-like code navigation could be implemented as plugins, but truly large memory would allow it to make local changes while keeping the context of the entire project in mind.
Gathering experience does not really require real-time learning. Annual releases will likely incorporate feedback from current user interactions, and monthly or weekly updates would bring it closer to a human pace of learning. Private language models trained on relevant projects, including private datasets, are more likely to pick up job-specific skills and preferences.
So no, there won't be a programmer productivity breakthrough with the current version of ChatGPT, but it is already saving time, and it is likely to improve dramatically within this decade.