Motivation
Few-shot learning is typically done with a fixed neural architecture. This paper proposes MetaNAS, the first method that fully integrates NAS with gradient-based meta-learning.
MetaNAS allows adapting architectures to novel tasks based on only a few data points with just a few steps of a gradient-based task optimizer. This lets MetaNAS generate task-specific architectures that are adapted to each task separately, yet derived from a jointly meta-learned meta-architecture.
$\alpha_{meta}$: meta-learned architecture
$w_{meta}$: corresponding meta-learned weights for the architecture
Task $T_i$: $(D_i^{tr}, D_i^{test})$
Meta-objective:
$$
\min_{\alpha, w} \mathcal{L}_{meta}(\alpha, w, p_{train}, \Phi^k)
= \min_{\alpha, w} \sum_{T_i \sim p_{train}} \mathcal{L}_i\big(\Phi^k(\alpha, w, D_i^{tr}),\, D_i^{test}\big)
= \min_{\alpha, w} \sum_{T_i \sim p_{train}} \mathcal{L}_i\big((\alpha^*_{T_i}, w^*_{T_i}),\, D_i^{test}\big)
$$

where $(\alpha^*_{T_i}, w^*_{T_i}) = \Phi^k(\alpha, w, D_i^{tr}) \approx \arg\min_{\alpha, w} \hat{\mathcal{L}}_i(\alpha, w, D_i^{tr})$ are the task-specific architecture and weights obtained after $k$ gradient steps of the task optimizer, i.e. the $\arg\min$ is approximated using SGD.
$\mathcal{L}_i$: query loss for task $i$ (evaluated on $D_i^{test}$)
$\hat{\mathcal{L}}_i$: support loss for task $i$ (evaluated on $D_i^{tr}$)
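A minimal sketch of evaluating one term of this meta-objective, assuming a PyTorch-style one-shot model whose weights and architecture parameters are exposed via hypothetical `weight_params()` / `arch_params()` accessors (illustrative names, not the paper's reference implementation):

```python
import copy
import torch
import torch.nn.functional as F

def adapt_and_evaluate(meta_model, support_loader, query_loader,
                       k=5, xi_task=0.01, lam_task=0.001):
    """Evaluate L_i(Phi^k(alpha_meta, w_meta, D_tr), D_test) for one task T_i.

    Assumes a DARTS-like one-shot model exposing weight_params() and
    arch_params(); these helper names are illustrative assumptions.
    """
    # Phi^k: k gradient steps of the task optimizer on the support loss L^_i
    task_model = copy.deepcopy(meta_model)
    w_opt = torch.optim.SGD(task_model.weight_params(), lr=xi_task)   # weights w
    a_opt = torch.optim.SGD(task_model.arch_params(), lr=lam_task)    # architecture alpha
    for _ in range(k):
        for x, y in support_loader:
            loss = F.cross_entropy(task_model(x), y)
            w_opt.zero_grad(); a_opt.zero_grad()
            loss.backward()
            w_opt.step(); a_opt.step()

    # Query loss L_i of the task-adapted (alpha*_Ti, w*_Ti) on D_i^test
    task_model.eval()
    with torch.no_grad():
        query_loss = sum(F.cross_entropy(task_model(x), y) for x, y in query_loader)
    return task_model, query_loss
```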
Inner loop update of $\alpha$ and $w$ with weight learning rate $\xi_{task}$ and architecture learning rate $\lambda_{task}$:
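A plain first-order form of these updates on the support loss (the actual task optimizer may include DARTS-style second-order terms) is:

$$
w \leftarrow w - \xi_{task}\, \nabla_{w}\, \hat{\mathcal{L}}_i(\alpha, w, D_i^{tr}), \qquad
\alpha \leftarrow \alpha - \lambda_{task}\, \nabla_{\alpha}\, \hat{\mathcal{L}}_i(\alpha, w, D_i^{tr})
$$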
Outer loop update:
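A MAML-style meta-update backpropagates the query loss through the $k$ adaptation steps; with meta learning rates $\xi_{meta}$ and $\lambda_{meta}$ (notation chosen here to mirror the task learning rates), a sketch is:

$$
w_{meta} \leftarrow w_{meta} - \xi_{meta}\, \nabla_{w_{meta}} \sum_{T_i \sim p_{train}} \mathcal{L}_i\big(\Phi^k(\alpha_{meta}, w_{meta}, D_i^{tr}),\, D_i^{test}\big)
$$
$$
\alpha_{meta} \leftarrow \alpha_{meta} - \lambda_{meta}\, \nabla_{\alpha_{meta}} \sum_{T_i \sim p_{train}} \mathcal{L}_i\big(\Phi^k(\alpha_{meta}, w_{meta}, D_i^{tr}),\, D_i^{test}\big)
$$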
Reptile could also be used instead of MAML here:
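Reptile avoids differentiating through $\Phi^k$ and instead moves the meta-parameters toward the task-adapted ones; a sketch of the standard Reptile update over a task batch $B$, applied to both $w$ and $\alpha$:

$$
w_{meta} \leftarrow w_{meta} + \xi_{meta}\, \frac{1}{|B|} \sum_{T_i \in B} \big(w^*_{T_i} - w_{meta}\big), \qquad
\alpha_{meta} \leftarrow \alpha_{meta} + \lambda_{meta}\, \frac{1}{|B|} \sum_{T_i \in B} \big(\alpha^*_{T_i} - \alpha_{meta}\big)
$$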
Task-dependent Architecture Adaptation
Two modifications are introduced to remove the need for retraining the task-specific architectures after adaptation.
Reference: