Go 2022–2024 and beyond: Let’s talk about AI

In this article, I’ll talk about where we are with Go today and what’s coming next, specifically in the context of how generative AI is changing software development. (Discussion hosted on LinkedIn)

The last few years have been about maturing the Go platform for mainstream users. We added generics, addressing the top language feature request since Go 1.0. We added feature flags for backwards compatibility, which enable major systems like Kubernetes to extend their support windows. We added automatic toolchain updates for forward compatibility, which enable us to fix longstanding language issues like the loop variable capture “gotcha”. And we greatly improved the software supply chain security of the Go project itself.
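
For readers who haven’t been bitten by it, here is a minimal sketch of the loop variable capture “gotcha” that the fix addresses (the string values are illustrative):

```go
package main

import (
	"fmt"
	"sync"
)

func main() {
	var wg sync.WaitGroup
	for _, v := range []string{"a", "b", "c"} {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// Before the fix, all three goroutines captured the same loop
			// variable and would typically print "c" three times; with
			// per-iteration variables, each goroutine sees its own v.
			fmt.Println(v)
		}()
	}
	wg.Wait()
}
```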

We’ve made major improvements to Go’s IDE support in VS Code and in gopls, the Go language server. These now scale to much larger code bases and support a variety of static analyses. We’re improving support for refactoring, and we recently added transparent toolchain telemetry, which will enable us to make data-driven improvements to the developer experience. Please opt in to telemetry by running “gotelemetry on” to help us make Go better for you.

We also improved Go’s support for production systems. We added structured logging to the standard library and improved support for HTTP routing. We enhanced code coverage to support integration tests. We added vulnerability management, a critical requirement for securing enterprise systems; and we made vulnerability triage much more efficient by using static analysis and the Go vulnerability database to automatically dismiss false positives (40% of reports). Finally, we launched profile-guided optimization (PGO), which has delivered great efficiency gains and sets us up to deliver much more.
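
To make a couple of those concrete, here is a minimal sketch, assuming the logging and routing improvements referenced are the log/slog package and the method and wildcard patterns added to net/http’s ServeMux (the route and port are illustrative):

```go
package main

import (
	"log/slog"
	"net/http"
	"os"
)

func main() {
	// Structured logging: key/value pairs emitted as JSON,
	// rather than hand-formatted strings.
	logger := slog.New(slog.NewJSONHandler(os.Stdout, nil))

	// Routing patterns that match on HTTP method and path wildcards.
	mux := http.NewServeMux()
	mux.HandleFunc("GET /items/{id}", func(w http.ResponseWriter, r *http.Request) {
		id := r.PathValue("id")
		logger.Info("request handled", "method", r.Method, "id", id)
		w.Write([]byte("item " + id))
	})

	logger.Info("listening", "addr", ":8080")
	if err := http.ListenAndServe(":8080", mux); err != nil {
		logger.Error("server failed", "err", err)
	}
}
```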

On the business front, Go continues its strong growth as the language of choice for scalable cloud applications. The cloud market is growing at a compound annual growth rate (CAGR) of over 15%, so the future is very bright for the Go ecosystem.

So what’s next? Surprising no one: AI

AI—specifically, generative AI using large language models (LLMs)—seems to be all anyone talks about in tech nowadays.

The reaction from many programmers has been skepticism: sure, these models generate text that sounds correct, code that seems right, images that look nice, but on closer inspection the output is full of errors (and extra fingers). How can we possibly build anything trustworthy on top of something so unreliable?

But then I recall that the Internet is built on unreliable networks, that early Google was built on commodity hardware, that successful organizations are a collection of fallible people. I think about all the techniques we’ve developed to build reliable systems out of unreliable components. There’s always an efficiency cost to doing this (see The Tail at Scale), but it’s possible. And as the underlying components become more reliable (for example, as AI models improve), the whole system can become more efficient. AI presents a new kind of unreliability for us to understand and engineer around, but I’m optimistic that we’ll figure it out over time.

Programmers mainly engage with AI on two fronts: AI assistance (using AI ourselves to be more productive) and AI applications (building software that uses AI to serve our users better). We are investigating both of these areas for Go. I’ll write about AI assistance in this article and explore AI applications (and Go’s relationship to Python) in future articles.

Before we dive in, I’ll speak to one concern: Many people have predicted that AI will make programming languages obsolete in favor of natural language. I disagree, because programming languages enable humans to specify what they want precisely, collaborate with other programmers, and debug and maintain software over time. AI will enable some use of natural language for these tasks, and AI has already proven useful atop low-code and no-code systems; but I believe programming languages will remain relevant to software development and operation for many years to come. (I am, of course, biased towards believing this. As Upton Sinclair said, “It is difficult to get a man to understand something when his salary depends on his not understanding it”. Check out Cassie Kozyrkov’s post for another viewpoint.)

AI developer assistance most often takes the form of code completion in the IDE. AI can also help generate code (for example, from a natural language description); explain code (roughly the reverse of generation); translate code between different programming languages; generate documentation, examples, or tests; explain and fix compilation errors, runtime errors, or test failures; answer general knowledge, onboarding, and “how to” questions; and more. AI developer assistance is available from a variety of providers using a variety of models, and some providers make it possible to train custom models on an organization’s private code to enable domain-specific responses.

AI-generated code often has errors, but so does human-generated code. Programmers have developed a variety of ways to validate whether code does what we intended: static checks (like type checking), dynamic checks (like thread sanitizers), model checking, unit tests, integration tests, fuzz tests, runtime monitoring, and more. We can apply validators to check the code generated by AI, and we can also use them to validate the code used to train AI models. An open question is how the errors that AI makes differ from those humans make, and whether we need new kinds of validators to catch those errors.
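
As a small illustration of one such validator, here is a self-contained Go fuzz test; strconv’s Quote and Unquote stand in for whatever code, human- or AI-written, is being validated:

```go
package fuzz

import (
	"strconv"
	"testing"
)

// FuzzQuoteRoundTrip checks a simple property: quoting a string and
// unquoting the result must return the original. Run it with
// `go test -fuzz=FuzzQuoteRoundTrip`; the fuzzer mutates the seed
// inputs and reports any input that breaks the property. Checks like
// this apply equally well whether a human or a model wrote the code.
func FuzzQuoteRoundTrip(f *testing.F) {
	f.Add("hello")
	f.Add(`quotes " and \ backslashes`)
	f.Fuzz(func(t *testing.T, s string) {
		q := strconv.Quote(s)
		got, err := strconv.Unquote(q)
		if err != nil {
			t.Fatalf("Unquote(Quote(%q)) failed: %v", s, err)
		}
		if got != s {
			t.Errorf("round trip: got %q, want %q", got, s)
		}
	})
}
```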

There’s an interesting connection between AI training data and software supply chain security (S3C): many of the code qualities we want from our training data are also the qualities we want from our dependencies. There may be an opportunity to align the work being done by organizations like the Open Source Security Foundation (OpenSSF) with the needs for training AI models on high-quality open source code.

I believe most programmers will use AI assistance, so we have prioritized making AI assistance great for Go developers. We are investigating:

  • How do we improve the quality of Go code generated by AI models? Can we differentiate between “good code” and “bad code” so that models can learn the difference? Is there value in synthesizing Go code for additional training data? Can we automatically fix errors in the training data with refactoring tools, then train models to make the same fixes? (Tools to identify good code and fix bad code are useful to programmers on their own, and they are also useful for AI and S3C.)
  • If models train on existing open source code, how do they learn to generate code that uses newly introduced language features and libraries? Can we “modernize” training data with refactoring tools, so that the models learn to use the latest idioms? Similar questions apply to training models that must produce code that has specific safety, security, or compliance properties.
  • How do we evaluate whether an AI model generates good Go code? Evaluation criteria are critical to enabling models to improve over time. What are the prompts and responses in this evaluation set? Should such evaluations be open source benchmarks, so we can compare the performance of different models?
  • How should IDEs prompt models to generate good Go code? What needs to be included in the prompt? Do IDEs need to understand Go workspace layout in order to provide the right context in the prompt? Do they need to fetch dependency code via retrieval-augmented generation (RAG) and include it in the prompt?

Today, each AI assistance provider has to address these issues independently for each programming language they want to support. We’re looking at this from the language provider’s point of view, trying to understand how to scale high-quality AI assistance across many models and providers. All of these questions apply to other programming languages besides Go, and I would be happy to see programming language projects coordinate on addressing them, so that AI assistance gets better for everyone.

Special thanks to Hana Kim for reviewing a draft of this article and providing great suggestions. Hana leads work on the VS Code Go plugin—the most widely used Go IDE—and is investigating how we can make AI assistance great for Go.

Author: Sameer Ajmani

Engineering Director at Google