Universal fault-tolerant quantum computers have been theoretically proven capable of solving certain problems faster than any classical computer. Demonstrating this capability on a practically relevant problem using a small noisy quantum computer, however, is challenging. In the field of machine learning – where computers are trained using data to perform human-like tasks – researchers are particularly interested in whether quantum computers can provide an advantage over their classical counterparts.
In our Nature Computational Science cover article (pictured), “The power of quantum neural networks”,1 we collaborated with researchers from ETH Zurich to address exactly this question: Can quantum computers provide an advantage for machine learning?
What does it mean to have a quantum advantage in machine learning?
A quantum advantage refers to solving practically relevant problems better or faster with a quantum computer than with the best classical computer employing the best-known classical algorithm. There has been a lot of work trying to identify practical machine learning applications that may give rise to such a quantum advantage; however, this is still an active area of research in the rapidly growing field of quantum machine learning.
In our paper, we address this question from a different angle—namely, through the lens of a model's capacity.
Simply put, a model's capacity determines its ability to fit a variety of different functions. If one model can fit more functions than another, the former is said to have a higher capacity than the latter. How best to measure a model's capacity, however, is not so clear cut.
The main technical contribution of our work motivates a new way to capture the capacity of any model (classical or quantum) with a measure called the effective dimension.
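Concretely, for a statistical model with d parameters living in a parameter space Θ of volume V_Θ, trained on n data samples, the effective dimension defined in the paper takes (up to notational details) the following form, where F̂(θ) is the Fisher information matrix normalized so that its trace averages to d over Θ, and γ ∈ (0, 1] is a constant:

```latex
d_{\gamma,n}(\mathcal{M}_\Theta)
  \;=\;
  2\,\frac{\log\!\left(
      \dfrac{1}{V_\Theta}\displaystyle\int_{\Theta}
      \sqrt{\det\!\left( I_d \;+\; \dfrac{\gamma\, n}{2\pi \log n}\,\hat{F}(\theta)\right)}
      \,d\theta\right)}
  {\log \dfrac{\gamma\, n}{2\pi \log n}}
```

Intuitively, directions in parameter space where the Fisher information is large contribute to the determinant, while flat, redundant directions do not, so the effective dimension counts only the parameters the model actively uses at the resolution set by n.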
The effective dimension of neural networks
The effective dimension stems from statistics and information theory. It ties into concepts like Occam's Razor, known more formally as the principle of Minimum Description Length, which allows us to formulate redundancy in various settings. Using the effective dimension as a capacity measure, we can quantify how much of a machine learning model is actually useful, versus how much of it is redundant. Naturally, we want models with as little redundancy as possible, to ensure we are getting the most out of them.
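To make this concrete, the effective dimension can be estimated by Monte Carlo: sample Fisher information matrices at random parameter points, normalize them, and average the determinant term. The sketch below, in numpy, follows the definition from the paper; the function name, sampling scheme, and default γ = 1 are our own illustrative choices, not the paper's reference implementation.

```python
import numpy as np

def effective_dimension(fishers, n, gamma=1.0):
    """Monte Carlo estimate of the effective dimension.

    fishers : array of shape (k, d, d) -- Fisher information matrices
              sampled at k random parameter points
    n       : number of data samples (sets the resolution scale)
    gamma   : constant in (0, 1] from the definition
    """
    k, d, _ = fishers.shape
    # Normalize so the trace of the Fisher matrix averages to d over the samples
    fishers = fishers * (d * k / np.trace(fishers, axis1=1, axis2=2).sum())
    kappa = gamma * n / (2 * np.pi * np.log(n))
    # log det(I + kappa * F) at each sample, via slogdet for numerical stability
    logdets = np.array([np.linalg.slogdet(np.eye(d) + kappa * F)[1]
                        for F in fishers])
    # Average sqrt(det(...)) in log-space (a small log-sum-exp trick)
    half = logdets / 2
    log_integral = np.log(np.mean(np.exp(half - half.max()))) + half.max()
    return 2 * log_integral / np.log(kappa)
```

As a sanity check, feeding in identity Fisher matrices (a model with no redundant directions) returns a value close to the full parameter count d, while rank-deficient matrices pull the estimate below d.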
Arguably, the best-performing machine learning models to date are neural networks, used in applications like language translation and image recognition. State-of-the-art neural network models consist of millions (sometimes even billions) of parameters that are optimized to fit data. The effective dimension determines what proportion of these parameters is actively being used by the network.
A nice analogy for this idea is to think about clouds in the sky. Even though clouds exist in three-dimensional space, the space they actually occupy is only a fraction of it: their effective space (or effective dimension) is 1.37, not 3. Machine learning models like large neural networks similarly live in very high-dimensional parameter spaces, but their true size, captured by the effective dimension, is typically far smaller.
Tangibly different quantum results
Interestingly, we showed that the effective dimension can also give us an idea of how well a model will perform on new information. In other words, the effective dimension can be linked to the generalization error of a statistical model, which is simply the error a model makes on out-of-sample data. The next natural step was to explore what the effective dimension actually looks like for models in the classical and quantum regimes. Do they look tangibly different?
We compared feed-forward neural networks to a particular type of "quantum neural network" motivated by previous studies. It turns out that the effective dimension of these two model classes can indeed be very different. The quantum neural networks achieved significantly higher effective dimensions than their classical counterparts, and we were able to demonstrate these results on today's hardware. Additionally, these high-effective-dimension quantum neural networks trained to lower loss values in fewer iterations, meaning that they could also fit data well.
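The ingredient such a comparison needs for each model class is a set of Fisher information matrices sampled at random parameter points. As a toy illustration, the sketch below computes empirical Fisher matrices (the average outer product of per-sample score vectors) for a logistic-regression model; this simple model stands in for the feed-forward and quantum networks studied in the paper, and the function names and Gaussian parameter sampling are our own illustrative choices.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def empirical_fisher(theta, X, y):
    """Empirical Fisher information of a logistic-regression model:
    the average outer product of per-sample log-likelihood gradients."""
    p = sigmoid(X @ theta)
    scores = (y - p)[:, None] * X      # per-sample score vectors
    return scores.T @ scores / len(y)

def fisher_samples(X, y, k=50, seed=0):
    """Fisher matrices at k random parameter draws, shaped (k, d, d),
    ready to feed into a Monte Carlo effective-dimension estimate."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    return np.stack([empirical_fisher(rng.normal(size=d), X, y)
                     for _ in range(k)])
```

The resulting matrices are symmetric and positive semidefinite by construction; their spectra are what separate a redundant model (many near-zero eigenvalues) from one with a high effective dimension.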
Although these results reveal a lot of promise for quantum machine learning, there are still many open questions. For instance, why do these quantum neural networks produce such high effective dimensions? Is this true for more general classes of quantum models? And do the favorable results, like trainability, hold for larger models with more qubits?
Answering these questions is not trivial, but doing so could shed light on the benefits of quantum machine learning in general. Going forward, we hope to address these and similar open questions with more work probing the power of quantum neural networks.