Posted at — May 22, 2022

Where is the current SOTA?

High-level idea of NoScope

Model Specialization

Key Assumption: only interested in identifying a small number of objects

Difference Detection

Two different kinds of difference detectors:

Difference detectors use MSE to measure difference between frames.

Difference detectors perform a difference check every $t_{skip}$ frames

Inference-optimized Model Search

This search method automatically identifies and learns an efficient model cascade which is able to reproduce the reference model's predictions on the video data, up to some desired accuracy. This is formalized as the following optimization problem: \(\max \qquad \mathbb{E}(throughput) \\\ \textnormal{s.t.} \ \textnormal{false positive} < FP^* , \textnormal{false negative} < FN^* \)

It searches the complete space of combinations of difference detectors and specialized models, and estimates the inference cost of each tool. It only considers spaces with one difference detector and one specialized model, since stacking model was not observed to help with performance.

The inference cost model is given by the following equation \( \begin{align} f_sT_{MSE} + f_sf_mT_{specialNN} + f_sf_mf_cT_{fullNN} \end{align} \)

The flow of models used in NoScope is as follows:

graph LR; DD-->specialNN; specialNN-->fullNN;


  1. Complex, nonlinear models
  2. Specialized models and DDs may be dependent on each other's performance
  3. Large search space, since the thresholds $\delta_{diff}, c_{low}, c_{high}$ must be learned, and are continuous values.


  1. NoScope uses grid search on different NN configurations to find the best model combination
  2. Run each trained model on every frame of an evaluation model, to determine selectivity (TNR) and sensitivity (TPR), to set thresholds $\delta_{diff}, c_{low}, c_{high}$.
  3. Linear sweep algorithm fo feasible threshold values.

Linear sweep

  1. For each difference detector D, sort the frames in decreasing order $\delta$.
  2. Compute the FPR and FNR for a given $\delta_{diff}$ threshold.
  3. For each specialNN, sort and set the $c_{low}, c_{high}$ to the desired FPR and FNR.
  4. Compute expected throughput using the cost model equation (1).


CBO complexity is $O(n_dn_cn_t)$, where $n_d$ is the number of DD configurations, $n_t$ is the number of specialNN configurations, $n_t$ is the number of firing thresholds. In practice, this is fast because $n_d,n_c,n_t$ are all less than $10^2$.


Possible ideas for exploration: