By using hypernetworks, the boffins can now pre-emptively fine-tune artificial neural networks, saving some of the time and expense of training.
Artificial intelligence is a numbers game. When deep neural networks, a form of AI that learns to discern patterns in data, began surpassing traditional algorithms 10 years ago, it was because we finally had enough data and processing power to make full use of them.
Training AI requires carefully tuning the values of millions or even billions of parameters that characterise these networks and represent the strengths of the connections between artificial neurons. The goal is to find ideal values for them, a process known as optimisation, but training the networks to reach this point isn't easy.
Boris Knyazev and his chums designed and trained a "hypernetwork" that speeds up the training process.
Given a new, untrained deep neural network designed for some task, the hypernetwork predicts the parameters for the new network in fractions of a second, and in theory could make training unnecessary.
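The idea can be pictured with a minimal numpy sketch. Everything here is illustrative, not the researchers' actual method: the toy target architecture, the fixed-size `arch_embedding` standing in for a description of the new network, and the small MLP used as the hypernetwork are all assumptions made for the example. The point it shows is the shape of the trick: one network emits, in a single forward pass, the entire parameter vector of another network, which can then run immediately without any training of its own.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical target network: a tiny 2-layer MLP whose weights we want
# to *predict* rather than train. Its layer shapes stand in for the
# "new, untrained network" handed to the hypernetwork.
target_shapes = [(4, 8), (8,), (8, 2), (2,)]
n_target_params = sum(int(np.prod(s)) for s in target_shapes)

# Made-up architecture embedding: a fixed-size vector describing the
# target network (a stand-in for however the design is encoded).
arch_embedding = rng.normal(size=16)

# The hypernetwork itself: a small MLP mapping the architecture
# embedding to one flat vector holding all the target's parameters.
W1 = rng.normal(scale=0.1, size=(16, 64))
W2 = rng.normal(scale=0.1, size=(64, n_target_params))

def hypernetwork(z):
    h = np.tanh(z @ W1)
    return h @ W2  # flat vector of predicted parameters

def unflatten(flat, shapes):
    """Split the predicted flat vector into per-layer tensors."""
    params, i = [], 0
    for s in shapes:
        n = int(np.prod(s))
        params.append(flat[i:i + n].reshape(s))
        i += n
    return params

def target_forward(x, params):
    w1, b1, w2, b2 = params
    return np.tanh(x @ w1 + b1) @ w2 + b2

# One forward pass of the hypernetwork yields a complete, usable
# parameter set; the target network then runs without any training.
predicted = unflatten(hypernetwork(arch_embedding), target_shapes)
out = target_forward(rng.normal(size=(5, 4)), predicted)
print(out.shape)  # (5, 2)
```

In the real system the prediction is one cheap forward pass, which is why it takes "fractions of a second" even though the target network may have millions of parameters.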
Because the hypernetwork learns the extremely complex patterns in the designs of deep neural networks, the work may also have deeper theoretical implications.
For now, the hypernetwork performs surprisingly well in certain settings, but there's still room for it to grow -- which is only natural given the magnitude of the problem.