
# NADS: Neural Architecture Distribution Search for Uncertainty Awareness

## Motivation

Out-of-distribution (OOD) errors are common in machine learning systems when the test data come from a distribution different from that of the training data.

Existing OOD detection approaches are error-prone, and OOD examples are sometimes even assigned higher likelihoods than in-distribution data.

There is currently no well-established guiding principle for designing OOD detection architectures that can accurately quantify uncertainty.

NADS is proposed for designing uncertainty-aware architectures. It can be used to optimize a stochastic OOD detection objective and to construct an ensemble of models that performs OOD detection.

## Idea

![idea of the paper](/files/-MFxSYICPrDB24QeT0jx)

For simplicity, suppose the architecture has a single operator with $$K$$ candidate operations.

Each operation $$i$$ has a corresponding weight $$\phi\_i$$ ( $$i=1,\dots,K$$ ) \[this could be done as in DARTS].

![](/files/-MFxXcLU4zIf1ZctceVM)

In this special case, we can view the architecture $$\alpha$$ as a vector in $$\lbrace 0,1\rbrace^K$$ \[my own comment: if a zero operation is added, then $$\alpha$$ is a dummy-coded vector; otherwise, it is a one-hot vector]. See <https://machinelearningmastery.com/one-hot-encoding-for-categorical-data/>.

More precisely, let $$b=\[b\_1,b\_2,\dots,b\_K] \in \lbrace 0,1\rbrace^K$$ denote the random categorical indicator vector sampled from the probability vector parameterized by $$\phi=\[\phi\_1,\phi\_2,\dots,\phi\_K]$$ \[see the figure below, from Tianhao's presentation].

![](/files/-MFxaFOomFhMSeNAwRUL)

Sampling $$\alpha$$ (equivalent to $$b$$ in this setting): $$\alpha \sim \mathrm{Multi}(\sigma(\phi))$$ \[that is, the probability of $$\alpha$$ is a function of $$\phi$$, written $$P\_\phi(\alpha)$$ ].

Given $$\alpha$$, we can compute the random output $$y$$ of the hidden layer for input data $$x$$: $$y=\sum\_{i=1}^{K} b\_i o\_i(x)$$, where $$o\_i$$ is the $$i$$-th candidate operation.

The original objective is to maximize the Widely Applicable Information Criterion (WAIC) of the training data:
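The sampling step and the masked sum above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: $$\sigma$$ is taken to be the softmax, and the three candidate operations in `ops` and the values in `phi` are hypothetical stand-ins, not operations from the paper.

```python
import math
import random

def softmax(phi):
    """Numerically stable softmax over a list of weights (assumed form of sigma)."""
    m = max(phi)
    exps = [math.exp(p - m) for p in phi]
    total = sum(exps)
    return [e / total for e in exps]

def sample_one_hot(phi, rng):
    """Draw a one-hot indicator b ~ Categorical(softmax(phi))."""
    probs = softmax(phi)
    k = rng.choices(range(len(phi)), weights=probs, k=1)[0]
    return [1 if i == k else 0 for i in range(len(phi))]

def mixed_output(b, ops, x):
    """y = sum_i b_i * o_i(x); with a one-hot b this selects a single operation."""
    return sum(b_i * op(x) for b_i, op in zip(b, ops))

# Hypothetical candidate operations (stand-ins for conv, pooling, identity, ...).
ops = [lambda x: x, lambda x: 2.0 * x, lambda x: x * x]
phi = [0.1, 1.5, -0.3]  # illustrative weights, one per operation

rng = random.Random(0)
b = sample_one_hot(phi, rng)
y = mixed_output(b, ops, 3.0)
print(b, y)
```

Because `b` is one-hot, `y` equals the output of exactly one candidate operation; repeated draws visit operations with frequencies given by `softmax(phi)`.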

![](/files/-MFxccmaBUAKdmpx7sni)

After Monte Carlo sampling (as described above), the objective becomes:

![](/files/-MFxclu6QE0KJZcO5Y_8)
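As a rough illustration of a Monte Carlo WAIC estimate, the sketch below assumes the score takes the commonly used form of the mean minus the variance of the log-likelihood across sampled architectures. That form, the function name `waic_score`, and the log-likelihood values are assumptions made for this example; the exact objective is the one shown in the figures.

```python
def waic_score(log_liks):
    """Mean minus variance of log p(x | alpha_m) over M sampled architectures."""
    n = len(log_liks)
    mean = sum(log_liks) / n
    var = sum((l - mean) ** 2 for l in log_liks) / n
    return mean - var

# Hypothetical log-likelihoods of one input under three sampled architectures.
agree = [-3.1, -3.0, -2.9]     # models agree: small variance penalty
disagree = [-1.0, -5.0, -9.0]  # models disagree: large variance penalty

print(waic_score(agree), waic_score(disagree))
```

Under this assumed form, an input on which the sampled models disagree receives a much lower score, which is what lets an ensemble flag it as likely OOD.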

To make the optimization tractable, the Gumbel-Softmax reparameterization is used to relax the discrete mask $$b$$ into a continuous random variable $$\tilde b$$.
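A minimal sketch of the Gumbel-Softmax relaxation follows, assuming $$\phi$$ acts as unnormalized log-probabilities (consistent with sampling from the softmax of $$\phi$$). The temperature name `tau` and the example values are illustrative choices, not taken from the paper.

```python
import math
import random

def gumbel_softmax(phi, tau, rng):
    """Relaxed sample b_tilde = softmax((phi + g) / tau), with g_i ~ Gumbel(0, 1)."""
    eps = 1e-12  # keeps u strictly inside (0, 1) so both logs are defined
    # Gumbel(0, 1) noise via inverse transform: g = -log(-log(u)).
    gumbels = [-math.log(-math.log(rng.uniform(eps, 1.0 - eps))) for _ in phi]
    logits = [(p + g) / tau for p, g in zip(phi, gumbels)]
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

phi = [0.1, 1.5, -0.3]  # illustrative weights
# The same seed gives the same Gumbel noise, so only the temperature differs.
soft = gumbel_softmax(phi, tau=1.0, rng=random.Random(0))
sharp = gumbel_softmax(phi, tau=0.1, rng=random.Random(0))
print(soft, sharp)
```

As `tau` decreases, $$\tilde b$$ approaches a one-hot vector. In an autodiff framework the same expression is differentiable with respect to $$\phi$$, which is the point of the relaxation.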

### Note:

* MC sampling
* flow-based generative model
* reparameterization method: Gumbel-Softmax
* how does it work for OOD detection?

## Reference

* <https://proceedings.icml.cc/static/paper_files/icml/2020/5738-Paper.pdf>
* <https://lilianweng.github.io/lil-log/2018/10/13/flow-based-deep-generative-models.html>
* <https://ai.googleblog.com/2019/12/improving-out-of-distribution-detection.html>

