It is crucially important to correct a trained neural network's mistakes quickly as they appear. In this work, we investigate the problem of neural network editing: how one can efficiently patch a mistake of the model on a particular sample without influencing the model's behavior on other samples. Namely, we propose Editable Training, a model-agnostic training technique that encourages fast editing of the trained model.
Editing Neural Networks
$f(x, \theta)$: a neural network
$L_{base}(\theta)$: the task-specific objective (loss) function
The goal is to change the model's predictions on a subset of inputs (the misclassified objects) without affecting its predictions on other inputs, by changing the model parameters $\theta$.
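As a concrete setup, this is a minimal sketch of what $f$ and $L_{base}$ might look like; the model architecture and loss here are illustrative choices, not taken from the paper:

```python
import torch
import torch.nn as nn

# f(x, theta): any differentiable model; a small MLP classifier as an illustration.
f = nn.Sequential(
    nn.Linear(784, 256),
    nn.ReLU(),
    nn.Linear(256, 10),
)

# L_base(theta): the task-specific objective, here standard cross-entropy.
L_base = nn.CrossEntropyLoss()

x = torch.randn(32, 784)          # a batch of inputs
y = torch.randint(0, 10, (32,))   # their labels
loss = L_base(f(x), y)            # ordinary training objective
loss.backward()
```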
This can be formalized with an editor function: $\hat{\theta} = Edit(\theta, l_e)$, subject to the constraint $l_e(\hat{\theta}) \le 0$, where $\hat{\theta}$ denotes the edited parameters.
For example, consider multi-class classification.
$l_e(\hat{\theta}) = \max_{y_i} \left( \log p(y_i \mid x, \hat{\theta}) - \log p(y_{ref} \mid x, \hat{\theta}) \right)$, where $y_{ref}$ is the desired label.
The constraint $l_e(\hat{\theta}) \le 0$ is satisfied if and only if $\arg\max_{y_i} \log p(y_i \mid x, \hat{\theta}) = y_{ref}$, i.e., the edited model predicts the desired label on $x$.
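In code, this edit loss and its constraint check might look as follows; this is a sketch, and `model`, `x_edit`, and `y_ref` are hypothetical names:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def edit_loss(model: nn.Module, x_edit: torch.Tensor, y_ref: int) -> torch.Tensor:
    """l_e(theta) = max_{y_i} log p(y_i | x, theta) - log p(y_ref | x, theta).

    Non-positive exactly when y_ref is the argmax class, i.e. the edit holds.
    """
    log_probs = F.log_softmax(model(x_edit), dim=-1).squeeze(0)
    return log_probs.max() - log_probs[y_ref]

# Constraint check: the edit is satisfied iff l_e(theta_hat) <= 0.
model = nn.Linear(784, 10)                 # hypothetical classifier
x_edit = torch.randn(1, 784)               # the misclassified sample
y_ref = 3                                  # the desired label
satisfied = edit_loss(model, x_edit, y_ref).item() <= 0
```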
The remaining question is how to design the editor function $Edit(\theta, l_e)$.
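One natural choice, in line with gradient-based editors, is to take a few gradient-descent steps on $l_e$ until the constraint is met. The sketch below assumes this implementation; the step size `alpha` and step budget `k` are hypothetical hyperparameters:

```python
import copy
import torch

def edit(model, edit_loss_fn, alpha=0.1, k=10):
    """Gradient-descent editor: theta_hat = Edit(theta, l_e).

    Takes up to k gradient steps on the edit loss, stopping as soon as
    the constraint l_e(theta_hat) <= 0 is satisfied. Returns an edited
    copy, leaving the original parameters untouched.
    """
    edited = copy.deepcopy(model)
    opt = torch.optim.SGD(edited.parameters(), lr=alpha)
    for _ in range(k):
        loss = edit_loss_fn(edited)
        if loss.item() <= 0:          # constraint satisfied, stop early
            return edited
        opt.zero_grad()
        loss.backward()
        opt.step()
    return edited

# Usage with the edit_loss defined above (names are illustrative):
# model_hat = edit(model, lambda m: edit_loss(m, x_edit, y_ref))
```

Editable Training then shapes $\theta$ during training so that such edits succeed quickly while leaving predictions on other inputs intact.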