Unverified Commit 5fefdcd5 by justadudewhohacks Committed by GitHub

Merge pull request #72 from justadudewhohacks/tiny-yolov2-seperable-conv2d

Tiny yolov2 seperable conv2d
parents 661f228b 4b4ecdb0
......@@ -2,4 +2,6 @@ node_modules
.rpt2_cache
.env*
tmp
proto
weights_uncompressed
weights_unused
\ No newline at end of file
......@@ -5,5 +5,6 @@ examples
proto
weights
weights_uncompressed
weights_unused
test
tools
\ No newline at end of file
......@@ -27,7 +27,7 @@ Table of Contents:
* **[Face Detection & 5 Point Face Landmarks - MTCNN](#usage-face-detection-mtcnn)**
* **[Face Recognition](#usage-face-recognition)**
* **[68 Point Face Landmark Detection](#usage-face-landmark-detection)**
* **[Full Face Detection and Recognition Pipeline](#usage-full-face-detection-and-recognition-pipeline)**
* **[Shortcut Functions for Full Face Description](#shortcut-functions)**
## Examples
......@@ -89,15 +89,15 @@ The face detection model has been trained on the [WIDERFACE dataset](http://mmla
### Face Detection - Tiny Yolo v2
The Tiny Yolo v2 based face detector can easily adapt to different input image sizes and can thus be used as an alternative to SSD Mobilenet v1 to trade off accuracy for performance (inference time). In general, the model is not as accurate as SSD Mobilenet v1, but it can achieve faster inference at lower image sizes.
The Tiny Yolo v2 implementation is a very performant face detector, which can easily adapt to different input image sizes and can thus be used as an alternative to SSD Mobilenet v1 to trade off accuracy for performance (inference time). In general, the model's ability to locate smaller face bounding boxes is not as accurate as that of SSD Mobilenet v1.
The Tiny Yolo v2 implementation is still experimental, meaning there is room for optimization (future work). The trained model weights are provided in the [azFace](https://github.com/azmathmoosa/azFace) project.
The face detector has been trained on a custom dataset of ~10K images labeled with bounding boxes and uses depthwise separable convolutions instead of regular convolutions, which ensures very fast inference and allows for a quantized model size of only 1.7MB, making the model extremely mobile and web friendly. Thus, the Tiny Yolo v2 face detector should be your GO-TO face detector on mobile devices.
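For reference, a minimal usage sketch of the Tiny Yolo v2 face detector (the model path and the parameter values are assumptions, see the usage sections below for details):
``` javascript
// load the quantized tiny yolov2 (separable conv) model, e.g. from /models
await faceapi.loadTinyYolov2Model('/models')

// detect faces, inputSize can be 'xs' | 'sm' | 'md' | 'lg' or a custom number
const detections = await faceapi.tinyYolov2(input, { inputSize: 'md', scoreThreshold: 0.5 })
```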
<a name="about-face-detection-mtcnn"></a>
### Face Detection & 5 Point Face Landmarks - MTCNN
MTCNN (Multi-task Cascaded Convolutional Neural Networks) represents an alternative face detector to SSD Mobilenet v1 and Tiny Yolo v2, which offers much more room for configuration and is able to achieve much lower processing times. MTCNN is a 3 stage cascaded CNN, which simultaneously returns 5 face landmark points along with the bounding boxes and scores for each face. By limiting the minimum size of faces expected in an image, MTCNN allows you to process frames from your webcam in realtime. Additionally, at 2MB, the weights file is only a third of the size of the quantized SSD Mobilenet v1 model (~6MB).
MTCNN (Multi-task Cascaded Convolutional Neural Networks) represents an alternative face detector to SSD Mobilenet v1 and Tiny Yolo v2, which offers much more room for configuration. By tuning the input parameters, MTCNN is able to detect a wide range of face bounding box sizes. MTCNN is a 3 stage cascaded CNN, which simultaneously returns 5 face landmark points along with the bounding boxes and scores for each face. By limiting the minimum size of faces expected in an image, MTCNN allows you to process frames from your webcam in realtime. Additionally, the model size is only 2MB.
MTCNN has been presented in the paper [Joint Face Detection and Alignment using Multi-task Cascaded Convolutional Networks](https://kpzhang93.github.io/MTCNN_face_detection_alignment/paper/spl.pdf) by Zhang et al. and the model weights are provided in the official [repo](https://github.com/kpzhang93/MTCNN_face_detection_alignment) of the MTCNN implementation.
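For reference, a minimal sketch using the MTCNN based shortcut function described later in this readme (it assumes the MTCNN and face recognition models have been loaded, and the minFaceSize value is just an example):
``` javascript
await faceapi.loadMtcnnModel('/models')
await faceapi.loadFaceRecognitionModel('/models')

// a larger minFaceSize skips smaller faces, but speeds up processing
const fullFaceDescriptions = await faceapi.allFacesMtcnn(input, { minFaceSize: 20 })
```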
......@@ -164,7 +164,7 @@ await net.load('/models/face_detection_model-weights_manifest.json')
// await net.load('/models/face_landmark_68_model-weights_manifest.json')
// await net.load('/models/face_recognition_model-weights_manifest.json')
// await net.load('/models/mtcnn_model-weights_manifest.json')
// await net.load('/models/tiny_yolov2_model-weights_manifest.json')
// await net.load('/models/tiny_yolov2_separable_conv_model-weights_manifest.json')
// or simply load all models
await net.load('/models')
......@@ -197,7 +197,7 @@ const maxResults = 10
// inputs can be html canvas, img or video element or their ids ...
const myImg = document.getElementById('myImg')
const detections = await faceapi.locateFaces(myImg, minConfidence, maxResults)
const detections = await faceapi.ssdMobilenetv1(myImg, minConfidence, maxResults)
```
Draw the detected faces to a canvas:
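The drawing code itself is not part of this diff; a minimal sketch based on the drawDetection call used in the example pages (the canvas id and image element are assumptions):
``` javascript
const { width, height } = myImg
const canvas = document.getElementById('overlay')
canvas.width = width
canvas.height = height

// draw the detections, rescaled to the displayed image size
faceapi.drawDetection('overlay', detections.map(det => det.forSize(width, height)))
```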
......@@ -356,7 +356,7 @@ const rightEyeBrow = landmarks.getRightEyeBrow()
Compute the Face Landmarks for Detected Faces:
``` javascript
const detections = await faceapi.locateFaces(input)
const detections = await faceapi.ssdMobilenetv1(input)
// get the face tensors from the image (have to be disposed manually)
const faceTensors = await faceapi.extractFaceTensors(input, detections)
......@@ -366,50 +366,35 @@ const landmarksByFace = await Promise.all(faceTensors.map(t => faceapi.detectLan
faceTensors.forEach(t => t.dispose())
```
<a name="usage-full-face-detection-and-recognition-pipeline"></a>
<a name="shortcut-functions"></a>
### Full Face Detection and Recognition Pipeline
### Shortcut Functions for Full Face Description
After face detection has been performed, I would recommend aligning the bounding boxes of the detected faces before passing them to the face recognition net, which will make the computed face descriptor much more accurate. Fortunately, the api can do this for you under the hood. You can obtain the full face descriptions (location, landmarks and descriptor) of each face in an input image as follows:
After face detection has been performed, I would recommend aligning the bounding boxes of the detected faces before passing them to the face recognition net, which will make the computed face descriptor much more accurate. Fortunately, the api can do this for you under the hood by providing convenient shortcut functions. You can obtain the full face descriptions (location, landmarks and descriptor) of each face in an input image as follows:
``` javascript
const fullFaceDescriptions = await faceapi.allFaces(input, minConfidence)
const fullFaceDescription0 = fullFaceDescriptions[0]
console.log(fullFaceDescription0.detection) // bounding box & score
console.log(fullFaceDescription0.landmarks) // 68 point face landmarks
console.log(fullFaceDescription0.descriptor) // face descriptor
Using the SSD Mobilenet v1 face detector + 68 point face landmark detector:
``` javascript
const fullFaceDescriptions = await faceapi.allFacesSsdMobilenetv1(input, minConfidence)
```
You can also do everything manually as shown in the following:
Using the Tiny Yolo v2 face detector + 68 point face landmark detector:
``` javascript
// first detect the face locations
const detections = await faceapi.locateFaces(input, minConfidence)
// get the face tensors from the image (have to be disposed manually)
const faceTensors = (await faceapi.extractFaceTensors(input, detections))
// detect landmarks and get the aligned face image bounding boxes
const alignedFaceBoxes = await Promise.all(faceTensors.map(
async (faceTensor, i) => {
const faceLandmarks = await faceapi.detectLandmarks(faceTensor)
return faceLandmarks.align(detections[i])
}
))
const fullFaceDescriptions = await faceapi.allFacesTinyYolov2(input, { inputSize: 'md' })
```
// free memory for face image tensors after we detected the face landmarks
faceTensors.forEach(t => t.dispose())
Or with MTCNN face detection + 5 point face landmarks:
// get the face tensors for the aligned face images from the image (have to be disposed manually)
const alignedFaceTensors = (await faceapi.extractFaceTensors(input, alignedFaceBoxes))
``` javascript
const fullFaceDescriptions = await faceapi.allFacesMtcnn(input, { minFaceSize: 20 })
```
// compute the face descriptors from the aligned face images
const descriptors = await Promise.all(alignedFaceTensors.map(
faceTensor => faceapi.computeFaceDescriptor(faceTensor)
))
The shortcut functions return an array of FullFaceDescriptions:
// free memory for face image tensors after we computed their descriptors
alignedFaceTensors.forEach(t => t.dispose())
``` javascript
const fullFaceDescription0 = fullFaceDescriptions[0]
console.log(fullFaceDescription0.detection) // bounding box & score
console.log(fullFaceDescription0.landmarks) // face landmarks
console.log(fullFaceDescription0.descriptor) // face descriptor
```
\ No newline at end of file
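To decide whether two of the returned descriptors belong to the same person, you can for example compute their euclidean distance; the 0.6 threshold below is the default used in the example pages and is only a rough guideline:
``` javascript
function euclideanDistance(d1, d2) {
  let sum = 0
  for (let i = 0; i < d1.length; i++) {
    const diff = d1[i] - d2[i]
    sum += diff * diff
  }
  return Math.sqrt(sum)
}

const distance = euclideanDistance(fullFaceDescriptions[0].descriptor, fullFaceDescriptions[1].descriptor)
const isSamePerson = distance < 0.6
```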
......@@ -147,6 +147,10 @@ function renderNavBar(navbarId, exampleUri) {
name: 'Tiny Yolov2 Face Detection Webcam'
},
{
uri: 'tiny_yolov2_face_recognition',
name: 'Tiny Yolov2 Face Recognition'
},
{
uri: 'batch_face_landmarks',
name: 'Batch Face Landmarks'
},
......
......@@ -32,6 +32,7 @@ app.get('/mtcnn_face_recognition_webcam', (req, res) => res.sendFile(path.join(v
app.get('/tiny_yolov2_face_detection', (req, res) => res.sendFile(path.join(viewsDir, 'tinyYolov2FaceDetection.html')))
app.get('/tiny_yolov2_face_detection_video', (req, res) => res.sendFile(path.join(viewsDir, 'tinyYolov2FaceDetectionVideo.html')))
app.get('/tiny_yolov2_face_detection_webcam', (req, res) => res.sendFile(path.join(viewsDir, 'tinyYolov2FaceDetectionWebcam.html')))
app.get('/tiny_yolov2_face_recognition', (req, res) => res.sendFile(path.join(viewsDir, 'tinyYolov2FaceRecognition.html')))
app.get('/batch_face_landmarks', (req, res) => res.sendFile(path.join(viewsDir, 'batchFaceLandmarks.html')))
app.get('/batch_face_recognition', (req, res) => res.sendFile(path.join(viewsDir, 'batchFaceRecognition.html')))
......
......@@ -53,6 +53,15 @@
let modelLoaded = false
let result
let forwardTimes = []
function updateTimeStats(timeInMs) {
forwardTimes = [timeInMs].concat(forwardTimes).slice(0, 30)
const avgTimeInMs = forwardTimes.reduce((total, t) => total + t) / forwardTimes.length
$('#time').val(`${Math.round(avgTimeInMs)} ms`)
$('#fps').val(`${faceapi.round(1000 / avgTimeInMs)}`)
}
function onIncreaseThreshold() {
minConfidence = Math.min(faceapi.round(minConfidence + 0.1), 1.0)
$('#minConfidence').val(minConfidence)
......@@ -63,11 +72,6 @@
$('#minConfidence').val(minConfidence)
}
function displayTimeStats(timeInMs) {
$('#time').val(`${timeInMs} ms`)
$('#fps').val(`${faceapi.round(1000 / timeInMs)}`)
}
async function onPlay(videoEl) {
if(videoEl.paused || videoEl.ended || !modelLoaded)
return false
......@@ -79,7 +83,7 @@
const ts = Date.now()
result = await faceapi.locateFaces(videoEl, minConfidence)
displayTimeStats(Date.now() - ts)
updateTimeStats(Date.now() - ts)
faceapi.drawDetection('overlay', result.map(det => det.forSize(width, height)))
setTimeout(() => onPlay(videoEl))
......
......@@ -63,6 +63,15 @@
let sizeType = 'md'
let modelLoaded = false
let forwardTimes = []
function updateTimeStats(timeInMs) {
forwardTimes = [timeInMs].concat(forwardTimes).slice(0, 30)
const avgTimeInMs = forwardTimes.reduce((total, t) => total + t) / forwardTimes.length
$('#time').val(`${Math.round(avgTimeInMs)} ms`)
$('#fps').val(`${faceapi.round(1000 / avgTimeInMs)}`)
}
function onIncreaseThreshold() {
scoreThreshold = Math.min(faceapi.round(scoreThreshold + 0.1), 1.0)
$('#scoreThreshold').val(scoreThreshold)
......@@ -78,11 +87,6 @@
$('#sizeType').val(sizeType)
}
function displayTimeStats(timeInMs) {
$('#time').val(`${timeInMs} ms`)
$('#fps').val(`${faceapi.round(1000 / timeInMs)}`)
}
async function onPlay(videoEl) {
if(videoEl.paused || videoEl.ended || !modelLoaded)
return false
......@@ -99,7 +103,7 @@
const ts = Date.now()
result = await faceapi.tinyYolov2(videoEl, forwardParams)
displayTimeStats(Date.now() - ts)
updateTimeStats(Date.now() - ts)
faceapi.drawDetection('overlay', result.map(det => det.forSize(width, height)))
setTimeout(() => onPlay(videoEl))
......
......@@ -64,6 +64,15 @@
let sizeType = '160'
let modelLoaded = false
let forwardTimes = []
function updateTimeStats(timeInMs) {
forwardTimes = [timeInMs].concat(forwardTimes).slice(0, 30)
const avgTimeInMs = forwardTimes.reduce((total, t) => total + t) / forwardTimes.length
$('#time').val(`${Math.round(avgTimeInMs)} ms`)
$('#fps').val(`${faceapi.round(1000 / avgTimeInMs)}`)
}
function onIncreaseThreshold() {
scoreThreshold = Math.min(faceapi.round(scoreThreshold + 0.1), 1.0)
$('#scoreThreshold').val(scoreThreshold)
......@@ -79,11 +88,6 @@
$('#sizeType').val(sizeType)
}
function displayTimeStats(timeInMs) {
$('#time').val(`${timeInMs} ms`)
$('#fps').val(`${faceapi.round(1000 / timeInMs)}`)
}
async function onPlay(videoEl) {
if(videoEl.paused || videoEl.ended || !modelLoaded)
return false
......@@ -100,7 +104,7 @@
const ts = Date.now()
result = await faceapi.tinyYolov2(videoEl, forwardParams)
displayTimeStats(Date.now() - ts)
updateTimeStats(Date.now() - ts)
faceapi.drawDetection('overlay', result.map(det => det.forSize(width, height)))
setTimeout(() => onPlay(videoEl))
......@@ -124,7 +128,7 @@
}
$(document).ready(function() {
renderNavBar('#navbar', 'tiny_yolov2_face_detection_video')
renderNavBar('#navbar', 'tiny_yolov2_face_detection_webcam')
const sizeTypeSelect = $('#sizeType')
sizeTypeSelect.val(sizeType)
......
<!DOCTYPE html>
<html>
<head>
<script src="face-api.js"></script>
<script src="commons.js"></script>
<link rel="stylesheet" href="styles.css">
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/materialize/0.100.2/css/materialize.css">
<script type="text/javascript" src="https://code.jquery.com/jquery-2.1.1.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/materialize/0.100.2/js/materialize.min.js"></script>
</head>
<body>
<div id="navbar"></div>
<div class="center-content page-container">
<div class="progress" id="loader">
<div class="indeterminate"></div>
</div>
<div style="position: relative" class="margin">
<img id="inputImg" src="" style="max-width: 800px;" />
<canvas id="overlay" />
</div>
<div class="row side-by-side">
<div id="selectList"></div>
<div class="row">
<label for="imgUrlInput">Get image from URL:</label>
<input id="imgUrlInput" type="text" class="bold">
</div>
<button
class="waves-effect waves-light btn"
onclick="loadImageFromUrl()"
>
Ok
</button>
<p>
<input type="checkbox" id="useBatchProcessing" onchange="onChangeUseBatchProcessing(event)" />
<label for="useBatchProcessing">Use Batch Processing</label>
</p>
</div>
<div class="row side-by-side">
<div class="row input-field" style="margin-right: 20px;">
<select id="sizeType">
<option value="" disabled selected>Input Size:</option>
<option value="xs">XS: 224 x 224</option>
<option value="sm">SM: 320 x 320</option>
<option value="md">MD: 416 x 416</option>
<option value="lg">LG: 608 x 608</option>
</select>
<label>Input Size</label>
</div>
<div class="row">
<label for="scoreThreshold">Score Threshold:</label>
<input disabled value="0.5" id="scoreThreshold" type="text" class="bold">
</div>
<button
class="waves-effect waves-light btn"
onclick="onDecreaseThreshold()"
>
<i class="material-icons left">-</i>
</button>
<button
class="waves-effect waves-light btn"
onclick="onIncreaseThreshold()"
>
<i class="material-icons left">+</i>
</button>
</div>
<div class="row side-by-side">
<div class="row">
<label for="maxDistance">Max Descriptor Distance:</label>
<input disabled value="0.6" id="maxDistance" type="text" class="bold">
</div>
<button
class="waves-effect waves-light btn button-sm"
onclick="onDecreaseMaxDistance()"
>
<i class="material-icons left">-</i>
</button>
<button
class="waves-effect waves-light btn button-sm"
onclick="onIncreaseMaxDistance()"
>
<i class="material-icons left">+</i>
</button>
</div>
</div>
<script>
let maxDistance = 0.6
let useBatchProcessing = false
let trainDescriptorsByClass = []
let scoreThreshold = 0.5
let sizeType = 'lg'
function onIncreaseThreshold() {
scoreThreshold = Math.min(faceapi.round(scoreThreshold + 0.1), 1.0)
$('#scoreThreshold').val(scoreThreshold)
updateResults()
}
function onDecreaseThreshold() {
scoreThreshold = Math.max(faceapi.round(scoreThreshold - 0.1), 0.1)
$('#scoreThreshold').val(scoreThreshold)
updateResults()
}
function onSizeTypeChanged(e, c) {
sizeType = e.target.value
$('#sizeType').val(sizeType)
updateResults()
}
function onChangeUseBatchProcessing(e) {
useBatchProcessing = $(e.target).prop('checked')
}
function onIncreaseMaxDistance() {
maxDistance = Math.min(faceapi.round(maxDistance + 0.1), 1.0)
$('#maxDistance').val(maxDistance)
updateResults()
}
function onDecreaseMaxDistance() {
maxDistance = Math.max(faceapi.round(maxDistance - 0.1), 0.1)
$('#maxDistance').val(maxDistance)
updateResults()
}
async function loadImageFromUrl(url) {
const img = await requestExternalImage($('#imgUrlInput').val())
$('#inputImg').get(0).src = img.src
updateResults()
}
async function updateResults() {
const inputImgEl = $('#inputImg').get(0)
const { width, height } = inputImgEl
const canvas = $('#overlay').get(0)
canvas.width = width
canvas.height = height
const forwardParams = {
inputSize: sizeType,
scoreThreshold
}
const fullFaceDescriptions = (await faceapi.allFacesTinyYolov2(inputImgEl, forwardParams, useBatchProcessing))
.map(fd => fd.forSize(width, height))
fullFaceDescriptions.forEach(({ detection, descriptor }) => {
faceapi.drawDetection('overlay', [detection], { withScore: false })
const bestMatch = getBestMatch(trainDescriptorsByClass, descriptor)
const text = `${bestMatch.distance < maxDistance ? bestMatch.className : 'unknown'} (${bestMatch.distance})`
const { x, y, height: boxHeight } = detection.getBox()
faceapi.drawText(
canvas.getContext('2d'),
x,
y + boxHeight,
text,
Object.assign(faceapi.getDefaultDrawOptions(), { color: 'red', fontSize: 16 })
)
})
}
async function onSelectionChanged(uri) {
const imgBuf = await fetchImage(uri)
$(`#inputImg`).get(0).src = (await faceapi.bufferToImage(imgBuf)).src
updateResults()
}
async function run() {
await faceapi.loadTinyYolov2Model('/')
await faceapi.loadFaceLandmarkModel('/')
await faceapi.loadFaceRecognitionModel('/')
trainDescriptorsByClass = await initTrainDescriptorsByClass(faceapi.recognitionNet, 1)
$('#loader').hide()
onSelectionChanged($('#selectList select').val())
}
$(document).ready(function() {
renderNavBar('#navbar', 'tiny_yolov2_face_recognition')
renderImageSelectList(
'#selectList',
async (uri) => {
await onSelectionChanged(uri)
},
'bbt1.jpg'
)
const sizeTypeSelect = $('#sizeType')
sizeTypeSelect.val(sizeType)
sizeTypeSelect.on('change', onSizeTypeChanged)
sizeTypeSelect.material_select()
run()
})
</script>
</body>
</html>
\ No newline at end of file
......@@ -4,7 +4,8 @@ const dataFiles = [
'test/data/*.json',
'test/media/*.mp4',
'weights/**/*',
'weights_uncompressed/**/*'
'weights_uncompressed/**/*',
'weights_unused/**/*'
].map(pattern => ({
pattern,
watched: false,
......
This source diff could not be displayed because it is too large. You can view the blob instead.
import { Rect } from './Rect';
import { Dimensions } from './types';
import { isDimensions } from './utils';
export class BoundingBox {
constructor(
......@@ -33,6 +35,10 @@ export class BoundingBox {
return this.bottom - this.top
}
public get area() : number {
return this.width * this.height
}
public toSquare(): BoundingBox {
let { left, top, right, bottom } = this
......@@ -100,6 +106,12 @@ export class BoundingBox {
).toSquare().round()
}
public rescale(s: Dimensions | number) {
const scaleX = isDimensions(s) ? (s as Dimensions).width : s as number
const scaleY = isDimensions(s) ? (s as Dimensions).height : s as number
return new BoundingBox(this.left * scaleX, this.top * scaleY, this.right * scaleX, this.bottom * scaleY)
}
public toRect(): Rect {
return new Rect(this.left, this.top, this.width, this.height)
}
......
......@@ -3,7 +3,7 @@ import * as tf from '@tensorflow/tfjs-core';
import { isTensor3D, isTensor4D } from './commons/isTensor';
import { padToSquare } from './padToSquare';
import { Point } from './Point';
import { TResolvedNetInput } from './types';
import { TResolvedNetInput, Dimensions } from './types';
import { createCanvasFromMedia } from './utils';
export class NetInput {
......@@ -14,6 +14,7 @@ export class NetInput {
private _inputDimensions: number[][] = []
private _paddings: Point[] = []
private _inputSize: number = 0
constructor(
inputs: tf.Tensor4D | Array<TResolvedNetInput>,
......@@ -81,6 +82,22 @@ export class NetInput {
return this._paddings
}
public get inputSize(): number {
return this._inputSize
}
public get relativePaddings(): Point[] {
return Array(this.inputs.length).fill(0).map(
(_, batchIdx) => this.getRelativePaddings(batchIdx)
)
}
public get reshapedInputDimensions(): Dimensions[] {
return Array(this.inputs.length).fill(0).map(
(_, batchIdx) => this.getReshapedInputDimensions(batchIdx)
)
}
public getInputDimensions(batchIdx: number): number[] {
return this._inputDimensions[batchIdx]
}
......@@ -97,8 +114,26 @@ export class NetInput {
return this._paddings[batchIdx]
}
public getRelativePaddings(batchIdx: number): Point {
return new Point(
(this.getPaddings(batchIdx).x + this.getInputWidth(batchIdx)) / this.getInputWidth(batchIdx),
(this.getPaddings(batchIdx).y + this.getInputHeight(batchIdx)) / this.getInputHeight(batchIdx)
)
}
public getReshapedInputDimensions(batchIdx: number): Dimensions {
const [h, w] = [this.getInputHeight(batchIdx), this.getInputWidth(batchIdx)]
const f = this.inputSize / Math.max(h, w)
return {
height: Math.floor(h * f),
width: Math.floor(w * f)
}
}
public toBatchTensor(inputSize: number, isCenterInputs: boolean = true): tf.Tensor4D {
this._inputSize = inputSize
return tf.tidy(() => {
const inputTensors = this._inputs.map((inputTensor: tf.Tensor3D) => {
......@@ -131,7 +166,7 @@ export class NetInput {
}
/**
* By setting the isManaged flag, all newly created tensors will be automatically
* By setting the isManaged flag, all newly created tensors will be
* automatically disposed after the batch tensor has been created
*/
public managed() {
......
import { BoundingBox } from './BoundingBox';
export interface IRect {
x: number
y: number
......@@ -54,6 +55,10 @@ export class Rect implements IRect {
)
}
public toBoundingBox(): BoundingBox {
return new BoundingBox(this.x, this.y, this.x + this.width, this.y + this.height)
}
public clipAtImageBorders(imgWidth: number, imgHeight: number): Rect {
const { x, y, right, bottom } = this
const clippedX = Math.max(x, 0)
......
import { TinyYolov2 } from '.';
import { extractFaceTensors } from './extractFaceTensors';
import { FaceDetection } from './FaceDetection';
import { FaceDetectionNet } from './faceDetectionNet/FaceDetectionNet';
import { FaceLandmarkNet } from './faceLandmarkNet/FaceLandmarkNet';
import { FaceLandmarks68 } from './faceLandmarkNet/FaceLandmarks68';
......@@ -7,6 +9,7 @@ import { FullFaceDescription } from './FullFaceDescription';
import { Mtcnn } from './mtcnn/Mtcnn';
import { MtcnnForwardParams } from './mtcnn/types';
import { Rect } from './Rect';
import { TinyYolov2ForwardParams } from './tinyYolov2/types';
import { TNetInput } from './types';
function computeDescriptorsFactory(
......@@ -27,8 +30,8 @@ function computeDescriptorsFactory(
}
}
export function allFacesFactory(
detectionNet: FaceDetectionNet,
function allFacesFactory(
detectFaces: (input: TNetInput) => Promise<FaceDetection[]>,
landmarkNet: FaceLandmarkNet,
recognitionNet: FaceRecognitionNet
) {
......@@ -36,11 +39,10 @@ export function allFacesFactory(
return async function(
input: TNetInput,
minConfidence: number = 0.8,
useBatchProcessing: boolean = false
): Promise<FullFaceDescription[]> {
const detections = await detectionNet.locateFaces(input, minConfidence)
const detections = await detectFaces(input)
const faceTensors = await extractFaceTensors(input, detections)
const faceLandmarksByFace = useBatchProcessing
......@@ -68,6 +70,38 @@ export function allFacesFactory(
}
}
export function allFacesSsdMobilenetv1Factory(
ssdMobilenetv1: FaceDetectionNet,
landmarkNet: FaceLandmarkNet,
recognitionNet: FaceRecognitionNet
) {
return async function(
input: TNetInput,
minConfidence: number = 0.8,
useBatchProcessing: boolean = false
): Promise<FullFaceDescription[]> {
const detectFaces = (input: TNetInput) => ssdMobilenetv1.locateFaces(input, minConfidence)
const allFaces = allFacesFactory(detectFaces, landmarkNet, recognitionNet)
return allFaces(input, useBatchProcessing)
}
}
export function allFacesTinyYolov2Factory(
tinyYolov2: TinyYolov2,
landmarkNet: FaceLandmarkNet,
recognitionNet: FaceRecognitionNet
) {
return async function(
input: TNetInput,
forwardParams: TinyYolov2ForwardParams = {},
useBatchProcessing: boolean = false
): Promise<FullFaceDescription[]> {
const detectFaces = (input: TNetInput) => tinyYolov2.locateFaces(input, forwardParams)
const allFaces = allFacesFactory(detectFaces, landmarkNet, recognitionNet)
return allFaces(input, useBatchProcessing)
}
}
export function allFacesMtcnnFactory(
mtcnn: Mtcnn,
recognitionNet: FaceRecognitionNet
......
import { BoundingBox } from '../BoundingBox';
import { iou } from '../iou';
export function nonMaxSuppression(
boxes: BoundingBox[],
......@@ -7,10 +8,6 @@ export function nonMaxSuppression(
isIOU: boolean = true
): number[] {
const areas = boxes.map(
box => (box.width + 1) * (box.height + 1)
)
let indicesSortedByScore = scores
.map((score, boxIndex) => ({ score, boxIndex }))
.sort((c1, c2) => c1.score - c2.score)
......@@ -31,15 +28,7 @@ export function nonMaxSuppression(
const currBox = boxes[curr]
const idxBox = boxes[idx]
const width = Math.max(0.0, Math.min(currBox.right, idxBox.right) - Math.max(currBox.left, idxBox.left) + 1)
const height = Math.max(0.0, Math.min(currBox.bottom, idxBox.bottom) - Math.max(currBox.top, idxBox.top) + 1)
const interSection = width * height
const out = isIOU
? interSection / (areas[curr] + areas[idx] - interSection)
: interSection / Math.min(areas[curr], areas[idx])
outputs.push(out)
outputs.push(iou(currBox, idxBox, isIOU))
}
indicesSortedByScore = indicesSortedByScore.filter(
......
import * as tf from '@tensorflow/tfjs-core';
export function normalize(x: tf.Tensor4D, meanRgb: number[]): tf.Tensor4D {
return tf.tidy(() => {
const [r, g, b] = meanRgb
const avg_r = tf.fill([...x.shape.slice(0, 3), 1], r)
const avg_g = tf.fill([...x.shape.slice(0, 3), 1], g)
const avg_b = tf.fill([...x.shape.slice(0, 3), 1], b)
const avg_rgb = tf.concat([avg_r, avg_g, avg_b], 3)
return tf.sub(x, avg_rgb)
})
}
\ No newline at end of file
......@@ -81,19 +81,18 @@ export class FaceDetectionNet extends NeuralNetwork<NetParams> {
minConfidence
)
const paddedHeightRelative = (netInput.getPaddings(0).y + netInput.getInputHeight(0)) / netInput.getInputHeight(0)
const paddedWidthRelative = (netInput.getPaddings(0).x + netInput.getInputWidth(0)) / netInput.getInputWidth(0)
const paddings = netInput.getRelativePaddings(0)
const results = indices
.map(idx => {
const [top, bottom] = [
Math.max(0, boxes.get(idx, 0)),
Math.min(1.0, boxes.get(idx, 2))
].map(val => val * paddedHeightRelative)
].map(val => val * paddings.y)
const [left, right] = [
Math.max(0, boxes.get(idx, 1)),
Math.min(1.0, boxes.get(idx, 3))
].map(val => val * paddedWidthRelative)
].map(val => val * paddings.x)
return new FaceDetection(
scoresData[idx],
new Rect(
......
import * as tf from '@tensorflow/tfjs-core';
import { NeuralNetwork } from '../commons/NeuralNetwork';
import { normalize } from '../commons/normalize';
import { NetInput } from '../NetInput';
import { toNetInput } from '../toNetInput';
import { TNetInput } from '../types';
import { convDown } from './convLayer';
import { extractParams } from './extractParams';
import { loadQuantizedParams } from './loadQuantizedParams';
import { normalize } from './normalize';
import { residual, residualDown } from './residualLayer';
import { NetParams } from './types';
......@@ -28,7 +28,8 @@ export class FaceRecognitionNet extends NeuralNetwork<NetParams> {
return tf.tidy(() => {
const batchTensor = input.toBatchTensor(150, true)
const normalized = normalize(batchTensor)
const meanRgb = [122.782, 117.001, 104.298]
const normalized = normalize(batchTensor, meanRgb).div(tf.scalar(256)) as tf.Tensor4D
let out = convDown(normalized, params.conv32_down)
out = tf.maxPool(out, 3, 2, 'valid')
......
import * as tf from '@tensorflow/tfjs-core';
export function normalize(x: tf.Tensor4D): tf.Tensor4D {
return tf.tidy(() => {
const avg_r = tf.fill([...x.shape.slice(0, 3), 1], 122.782);
const avg_g = tf.fill([...x.shape.slice(0, 3), 1], 117.001);
const avg_b = tf.fill([...x.shape.slice(0, 3), 1], 104.298);
const avg_rgb = tf.concat([avg_r, avg_g, avg_b], 3)
return tf.div(tf.sub(x, avg_rgb), tf.scalar(256))
})
}
\ No newline at end of file
import * as tf from '@tensorflow/tfjs-core';
import { allFacesFactory, allFacesMtcnnFactory } from './allFacesFactory';
import { allFacesMtcnnFactory, allFacesSsdMobilenetv1Factory, allFacesTinyYolov2Factory } from './allFacesFactory';
import { FaceDetection } from './FaceDetection';
import { FaceDetectionNet } from './faceDetectionNet/FaceDetectionNet';
import { FaceLandmarkNet } from './faceLandmarkNet/FaceLandmarkNet';
......@@ -21,15 +21,15 @@ export const recognitionNet = new FaceRecognitionNet()
// nets need more specific names, to avoid ambiguity in future
// when alternative net implementations are provided
export const nets = {
ssdMobilenet: detectionNet,
ssdMobilenetv1: detectionNet,
faceLandmark68Net: landmarkNet,
faceRecognitionNet: recognitionNet,
mtcnn: new Mtcnn(),
tinyYolov2: new TinyYolov2()
}
export function loadFaceDetectionModel(url: string) {
return nets.ssdMobilenet.load(url)
export function loadSsdMobilenetv1Model(url: string) {
return nets.ssdMobilenetv1.load(url)
}
export function loadFaceLandmarkModel(url: string) {
......@@ -48,9 +48,13 @@ export function loadTinyYolov2Model(url: string) {
return nets.tinyYolov2.load(url)
}
export function loadFaceDetectionModel(url: string) {
return loadSsdMobilenetv1Model(url)
}
export function loadModels(url: string) {
return Promise.all([
loadFaceDetectionModel(url),
loadSsdMobilenetv1Model(url),
loadFaceLandmarkModel(url),
loadFaceRecognitionModel(url),
loadMtcnnModel(url),
......@@ -63,7 +67,7 @@ export function locateFaces(
minConfidence?: number,
maxResults?: number
): Promise<FaceDetection[]> {
return nets.ssdMobilenet.locateFaces(input, minConfidence, maxResults)
return nets.ssdMobilenetv1.locateFaces(input, minConfidence, maxResults)
}
export function detectLandmarks(
......@@ -92,14 +96,26 @@ export function tinyYolov2(
return nets.tinyYolov2.locateFaces(input, forwardParams)
}
export type allFacesFunction = (
export type allFacesSsdMobilenetv1Function = (
input: tf.Tensor | NetInput | TNetInput,
minConfidence?: number,
useBatchProcessing?: boolean
) => Promise<FullFaceDescription[]>
export const allFaces: allFacesFunction = allFacesFactory(
nets.ssdMobilenet,
export const allFacesSsdMobilenetv1: allFacesSsdMobilenetv1Function = allFacesSsdMobilenetv1Factory(
nets.ssdMobilenetv1,
nets.faceLandmark68Net,
nets.faceRecognitionNet
)
export type allFacesTinyYolov2Function = (
input: tf.Tensor | NetInput | TNetInput,
forwardParams?: TinyYolov2ForwardParams,
useBatchProcessing?: boolean
) => Promise<FullFaceDescription[]>
export const allFacesTinyYolov2: allFacesTinyYolov2Function = allFacesTinyYolov2Factory(
nets.tinyYolov2,
nets.faceLandmark68Net,
nets.faceRecognitionNet
)
......@@ -114,3 +130,5 @@ export const allFacesMtcnn: allFacesMtcnnFunction = allFacesMtcnnFactory(
nets.mtcnn,
nets.faceRecognitionNet
)
export const allFaces = allFacesSsdMobilenetv1
\ No newline at end of file
......@@ -5,6 +5,7 @@ export {
}
export * from './BoundingBox';
export * from './FaceDetection';
export * from './FullFaceDescription';
export * from './NetInput';
......@@ -19,8 +20,11 @@ export * from './faceDetectionNet';
export * from './faceLandmarkNet';
export * from './faceRecognitionNet';
export * from './globalApi';
export * from './iou';
export * from './mtcnn';
export * from './padToSquare';
export * from './tinyYolov2';
export * from './toNetInput';
export * from './utils'
\ No newline at end of file
export * from './utils';
export * from './types';
\ No newline at end of file
import { BoundingBox } from './BoundingBox';
export function iou(box1: BoundingBox, box2: BoundingBox, isIOU: boolean = true) {
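// returns intersection over union of the two boxes if isIOU is true, otherwise intersection over the smaller box area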
const width = Math.max(0.0, Math.min(box1.right, box2.right) - Math.max(box1.left, box2.left) + 1)
const height = Math.max(0.0, Math.min(box1.bottom, box2.bottom) - Math.max(box1.top, box2.top) + 1)
const interSection = width * height
return isIOU
? interSection / (box1.area + box2.area - interSection)
: interSection / Math.min(box1.area, box2.area)
}
\ No newline at end of file
......@@ -4,21 +4,37 @@ import { BoundingBox } from '../BoundingBox';
import { convLayer } from '../commons/convLayer';
import { NeuralNetwork } from '../commons/NeuralNetwork';
import { nonMaxSuppression } from '../commons/nonMaxSuppression';
import { normalize } from '../commons/normalize';
import { FaceDetection } from '../FaceDetection';
import { NetInput } from '../NetInput';
import { Point } from '../Point';
import { toNetInput } from '../toNetInput';
import { TNetInput } from '../types';
import { BOX_ANCHORS, INPUT_SIZES, IOU_THRESHOLD, NUM_BOXES } from './config';
import { sigmoid } from '../utils';
import { BOX_ANCHORS, BOX_ANCHORS_SEPARABLE, INPUT_SIZES, IOU_THRESHOLD, MEAN_RGB, NUM_BOXES } from './config';
import { convWithBatchNorm } from './convWithBatchNorm';
import { extractParams } from './extractParams';
import { getDefaultParams } from './getDefaultParams';
import { loadQuantizedParams } from './loadQuantizedParams';
import { NetParams, TinyYolov2ForwardParams } from './types';
import { NetParams, PostProcessingParams, TinyYolov2ForwardParams } from './types';
export class TinyYolov2 extends NeuralNetwork<NetParams> {
constructor() {
private _withSeparableConvs: boolean
private _anchors: Point[]
constructor(withSeparableConvs: boolean = true) {
super('TinyYolov2')
this._withSeparableConvs = withSeparableConvs
this._anchors = withSeparableConvs ? BOX_ANCHORS_SEPARABLE : BOX_ANCHORS
}
public get withSeparableConvs(): boolean {
return this._withSeparableConvs
}
public get anchors(): Point[] {
return this._anchors
}
public forwardInput(input: NetInput, inputSize: number): tf.Tensor4D {
......@@ -30,7 +46,12 @@ export class TinyYolov2 extends NeuralNetwork<NetParams> {
}
const out = tf.tidy(() => {
const batchTensor = input.toBatchTensor(inputSize, false).div(tf.scalar(255)).toFloat() as tf.Tensor4D
let batchTensor = input.toBatchTensor(inputSize, false)
batchTensor = this.withSeparableConvs
? normalize(batchTensor, MEAN_RGB)
: batchTensor
batchTensor = batchTensor.div(tf.scalar(256)) as tf.Tensor4D
let out = convWithBatchNorm(batchTensor, params.conv0)
out = tf.maxPool(out, [2, 2], [2, 2], 'same')
......@@ -67,44 +88,78 @@ export class TinyYolov2 extends NeuralNetwork<NetParams> {
: _inputSize
if (typeof inputSize !== 'number') {
throw new Error(`TinyYolov2 - unkown inputSize: ${inputSize}, expected number or one of xs | sm | md | lg`)
throw new Error(`TinyYolov2 - unknown inputSize: ${inputSize}, expected number or one of xs | sm | md | lg`)
}
const netInput = await toNetInput(input, true)
const out = await this.forwardInput(netInput, inputSize)
const numCells = out.shape[1]
const out0 = tf.tidy(() => tf.unstack(out)[0].expandDims()) as tf.Tensor4D
const inputDimensions = {
width: netInput.getInputWidth(0),
height: netInput.getInputHeight(0)
}
const paddings = netInput.getRelativePaddings(0)
const results = this.postProcess(out0, { scoreThreshold, paddings })
const boxes = results.map(res => res.box)
const scores = results.map(res => res.score)
const [boxesTensor, scoresTensor] = tf.tidy(() => {
const reshaped = out.reshape([numCells, numCells, NUM_BOXES, 6])
out.dispose()
out0.dispose()
const indices = nonMaxSuppression(
boxes.map(box => box.rescale(inputSize)),
scores,
IOU_THRESHOLD,
true
)
const detections = indices.map(idx =>
new FaceDetection(
scores[idx],
boxes[idx].toRect(),
inputDimensions
)
)
return detections
}
public postProcess(outputTensor: tf.Tensor4D, { scoreThreshold, paddings }: PostProcessingParams) {
const numCells = outputTensor.shape[1]
const [boxesTensor, scoresTensor] = tf.tidy(() => {
const reshaped = outputTensor.reshape([numCells, numCells, NUM_BOXES, this.withSeparableConvs ? 5 : 6])
const boxes = reshaped.slice([0, 0, 0, 0], [numCells, numCells, NUM_BOXES, 4])
const scores = reshaped.slice([0, 0, 0, 4], [numCells, numCells, NUM_BOXES, 1])
return [boxes, scores]
})
const expit = (x: number): number => 1 / (1 + Math.exp(-x))
const paddedHeightRelative = (netInput.getPaddings(0).y + netInput.getInputHeight(0)) / netInput.getInputHeight(0)
const paddedWidthRelative = (netInput.getPaddings(0).x + netInput.getInputWidth(0)) / netInput.getInputWidth(0)
const boxes: BoundingBox[] = []
const scores: number[] = []
const results = []
for (let row = 0; row < numCells; row ++) {
for (let col = 0; col < numCells; col ++) {
for (let box = 0; box < NUM_BOXES; box ++) {
const score = expit(scoresTensor.get(row, col, box, 0))
if (score > scoreThreshold) {
const ctX = ((col + expit(boxesTensor.get(row, col, box, 0))) / numCells) * paddedWidthRelative
const ctY = ((row + expit(boxesTensor.get(row, col, box, 1))) / numCells) * paddedHeightRelative
const width = ((Math.exp(boxesTensor.get(row, col, box, 2)) * BOX_ANCHORS[box].x) / numCells) * paddedWidthRelative
const height = ((Math.exp(boxesTensor.get(row, col, box, 3)) * BOX_ANCHORS[box].y) / numCells) * paddedHeightRelative
for (let anchor = 0; anchor < NUM_BOXES; anchor ++) {
const score = sigmoid(scoresTensor.get(row, col, anchor, 0))
if (!scoreThreshold || score > scoreThreshold) {
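// decode the box relative to the grid: cell offset plus sigmoid of the predicted center,
// anchor dimensions scaled by exp of the predicted size, corrected by the relative paddings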
const ctX = ((col + sigmoid(boxesTensor.get(row, col, anchor, 0))) / numCells) * paddings.x
const ctY = ((row + sigmoid(boxesTensor.get(row, col, anchor, 1))) / numCells) * paddings.y
const width = ((Math.exp(boxesTensor.get(row, col, anchor, 2)) * this.anchors[anchor].x) / numCells) * paddings.x
const height = ((Math.exp(boxesTensor.get(row, col, anchor, 3)) * this.anchors[anchor].y) / numCells) * paddings.y
const x = (ctX - (width / 2))
const y = (ctY - (height / 2))
boxes.push(new BoundingBox(x, y, x + width, y + height))
scores.push(score)
results.push({
box: new BoundingBox(x, y, x + width, y + height),
score,
row,
col,
anchor
})
}
}
}
......@@ -113,34 +168,14 @@ export class TinyYolov2 extends NeuralNetwork<NetParams> {
boxesTensor.dispose()
scoresTensor.dispose()
const indices = nonMaxSuppression(
boxes.map(box => new BoundingBox(
box.left * inputSize,
box.top * inputSize,
box.right * inputSize,
box.bottom * inputSize
)),
scores,
IOU_THRESHOLD,
true
)
const detections = indices.map(idx =>
new FaceDetection(
scores[idx],
boxes[idx].toRect(),
{ width: netInput.getInputWidth(0), height: netInput.getInputHeight(0) }
)
)
return detections
return results
}
protected loadQuantizedParams(uri: string | undefined) {
return loadQuantizedParams(uri)
return loadQuantizedParams(uri, this.withSeparableConvs)
}
protected extractParams(weights: Float32Array) {
return extractParams(weights)
return extractParams(weights, this.withSeparableConvs)
}
}
\ No newline at end of file
......@@ -11,3 +11,13 @@ export const BOX_ANCHORS = [
new Point(10.246, 4.59428),
new Point(12.6868, 11.8741)
]
export const BOX_ANCHORS_SEPARABLE = [
new Point(1.603231, 2.094468),
new Point(6.041143, 7.080126),
new Point(2.882459, 3.518061),
new Point(4.266906, 5.178857),
new Point(9.041765, 10.66308)
]
export const MEAN_RGB = [117.001, 114.697, 97.404]
\ No newline at end of file
import * as tf from '@tensorflow/tfjs-core';
import { leaky } from './leaky';
import { ConvWithBatchNorm } from './types';
import { ConvWithBatchNorm, SeparableConvParams } from './types';
export function convWithBatchNorm(x: tf.Tensor4D, params: ConvWithBatchNorm): tf.Tensor4D {
export function convWithBatchNorm(x: tf.Tensor4D, params: ConvWithBatchNorm | SeparableConvParams): tf.Tensor4D {
return tf.tidy(() => {
let out = tf.pad(x, [[0, 0], [1, 1], [1, 1], [0, 0]]) as tf.Tensor4D
if (params instanceof SeparableConvParams) {
out = tf.separableConv2d(out, params.depthwise_filter, params.pointwise_filter, [1, 1], 'valid')
out = tf.add(out, params.bias)
} else {
out = tf.conv2d(out, params.conv.filters, [1, 1], 'valid')
out = tf.sub(out, params.bn.sub)
out = tf.mul(out, params.bn.truediv)
out = tf.add(out, params.conv.bias)
}
return leaky(out)
})
}
\ No newline at end of file
......@@ -3,7 +3,7 @@ import * as tf from '@tensorflow/tfjs-core';
import { extractConvParamsFactory } from '../commons/extractConvParamsFactory';
import { extractWeightsFactory } from '../commons/extractWeightsFactory';
import { ExtractWeightsFunction, ParamMapping } from '../commons/types';
import { BatchNorm, ConvWithBatchNorm, NetParams } from './types';
import { BatchNorm, ConvWithBatchNorm, NetParams, SeparableConvParams } from './types';
function extractorsFactory(extractWeights: ExtractWeightsFunction, paramMappings: ParamMapping[]) {
......@@ -30,14 +30,33 @@ function extractorsFactory(extractWeights: ExtractWeightsFunction, paramMappings
return { conv, bn }
}
function extractSeparableConvParams(channelsIn: number, channelsOut: number, mappedPrefix: string): SeparableConvParams {
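// a separable conv consists of a 3x3 depthwise filter (one filter per input channel), a 1x1 pointwise filter mixing the channels, and a bias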
const depthwise_filter = tf.tensor4d(extractWeights(3 * 3 * channelsIn), [3, 3, channelsIn, 1])
const pointwise_filter = tf.tensor4d(extractWeights(channelsIn * channelsOut), [1, 1, channelsIn, channelsOut])
const bias = tf.tensor1d(extractWeights(channelsOut))
paramMappings.push(
{ paramPath: `${mappedPrefix}/depthwise_filter` },
{ paramPath: `${mappedPrefix}/pointwise_filter` },
{ paramPath: `${mappedPrefix}/bias` }
)
return new SeparableConvParams(
depthwise_filter,
pointwise_filter,
bias
)
}
return {
extractConvParams,
extractConvWithBatchNormParams
extractConvWithBatchNormParams,
extractSeparableConvParams
}
}
export function extractParams(weights: Float32Array): { params: NetParams, paramMappings: ParamMapping[] } {
export function extractParams(weights: Float32Array, withSeparableConvs: boolean): { params: NetParams, paramMappings: ParamMapping[] } {
const {
extractWeights,
......@@ -48,18 +67,22 @@ export function extractParams(weights: Float32Array): { params: NetParams, param
const {
extractConvParams,
extractConvWithBatchNormParams
extractConvWithBatchNormParams,
extractSeparableConvParams
} = extractorsFactory(extractWeights, paramMappings)
const conv0 = extractConvWithBatchNormParams(3, 16, 'conv0')
const conv1 = extractConvWithBatchNormParams(16, 32, 'conv1')
const conv2 = extractConvWithBatchNormParams(32, 64, 'conv2')
const conv3 = extractConvWithBatchNormParams(64, 128, 'conv3')
const conv4 = extractConvWithBatchNormParams(128, 256, 'conv4')
const conv5 = extractConvWithBatchNormParams(256, 512, 'conv5')
const conv6 = extractConvWithBatchNormParams(512, 1024, 'conv6')
const conv7 = extractConvWithBatchNormParams(1024, 1024, 'conv7')
const conv8 = extractConvParams(1024, 30, 1, 'conv8')
const extractConvFn = withSeparableConvs ? extractSeparableConvParams : extractConvWithBatchNormParams
const numAnchorEncodings = withSeparableConvs ? 5 : 6
const conv0 = extractConvFn(3, 16, 'conv0')
const conv1 = extractConvFn(16, 32, 'conv1')
const conv2 = extractConvFn(32, 64, 'conv2')
const conv3 = extractConvFn(64, 128, 'conv3')
const conv4 = extractConvFn(128, 256, 'conv4')
const conv5 = extractConvFn(256, 512, 'conv5')
const conv6 = extractConvFn(512, 1024, 'conv6')
const conv7 = extractConvFn(1024, 1024, 'conv7')
const conv8 = extractConvParams(1024, 5 * numAnchorEncodings, 1, 'conv8')
if (getRemainingWeights().length !== 0) {
throw new Error(`weights remaining after extract: ${getRemainingWeights().length}`)
......
......@@ -4,7 +4,7 @@ export function getDefaultParams(params: TinyYolov2ForwardParams) {
return Object.assign(
{},
{
sizeType: SizeType.MD,
inputSize: SizeType.MD,
scoreThreshold: 0.5
},
params
......
......@@ -2,8 +2,8 @@ import { TinyYolov2 } from './TinyYolov2';
export * from './TinyYolov2';
export function createTinyYolov2(weights: Float32Array) {
const net = new TinyYolov2()
export function createTinyYolov2(weights: Float32Array, withSeparableConvs: boolean = true) {
const net = new TinyYolov2(withSeparableConvs)
net.extractWeights(weights)
return net
}
\ No newline at end of file
......@@ -2,6 +2,8 @@ import * as tf from '@tensorflow/tfjs-core';
export function leaky(x: tf.Tensor4D): tf.Tensor4D {
return tf.tidy(() => {
return tf.maximum(x, tf.mul(x, tf.scalar(0.10000000149011612)))
const min = tf.mul(x, tf.scalar(0.10000000149011612))
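// relu(x - min) + min is equivalent to max(x, min), i.e. leaky relu with a slope of 0.1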
return tf.add(tf.relu(tf.sub(x, min)), min)
//return tf.maximum(x, min)
})
}
\ No newline at end of file
......@@ -4,9 +4,10 @@ import { disposeUnusedWeightTensors } from '../commons/disposeUnusedWeightTensor
import { extractWeightEntryFactory } from '../commons/extractWeightEntryFactory';
import { loadWeightMap } from '../commons/loadWeightMap';
import { ConvParams, ParamMapping } from '../commons/types';
import { BatchNorm, ConvWithBatchNorm, NetParams } from './types';
import { BatchNorm, ConvWithBatchNorm, NetParams, SeparableConvParams } from './types';
const DEFAULT_MODEL_NAME = 'tiny_yolov2_model'
const DEFAULT_MODEL_NAME_SEPARABLE_CONV = 'tiny_yolov2_separable_conv_model'
function extractorsFactory(weightMap: any, paramMappings: ParamMapping[]) {
......@@ -30,35 +31,51 @@ function extractorsFactory(weightMap: any, paramMappings: ParamMapping[]) {
return { conv, bn }
}
function extractSeparableConvParams(prefix: string): SeparableConvParams {
const depthwise_filter = extractWeightEntry<tf.Tensor4D>(`${prefix}/depthwise_filter`, 4)
const pointwise_filter = extractWeightEntry<tf.Tensor4D>(`${prefix}/pointwise_filter`, 4)
const bias = extractWeightEntry<tf.Tensor1D>(`${prefix}/bias`, 1)
return new SeparableConvParams(
depthwise_filter,
pointwise_filter,
bias
)
}
return {
extractConvParams,
extractConvWithBatchNormParams
extractConvWithBatchNormParams,
extractSeparableConvParams
}
}
export async function loadQuantizedParams(
uri: string | undefined
uri: string | undefined,
withSeparableConvs: boolean
): Promise<{ params: NetParams, paramMappings: ParamMapping[] }> {
const weightMap = await loadWeightMap(uri, DEFAULT_MODEL_NAME)
const weightMap = await loadWeightMap(uri, withSeparableConvs ? DEFAULT_MODEL_NAME_SEPARABLE_CONV : DEFAULT_MODEL_NAME)
const paramMappings: ParamMapping[] = []
const {
extractConvParams,
extractConvWithBatchNormParams
extractConvWithBatchNormParams,
extractSeparableConvParams
} = extractorsFactory(weightMap, paramMappings)
const extractConvFn = withSeparableConvs ? extractSeparableConvParams : extractConvWithBatchNormParams
const params = {
conv0: extractConvWithBatchNormParams('conv0'),
conv1: extractConvWithBatchNormParams('conv1'),
conv2: extractConvWithBatchNormParams('conv2'),
conv3: extractConvWithBatchNormParams('conv3'),
conv4: extractConvWithBatchNormParams('conv4'),
conv5: extractConvWithBatchNormParams('conv5'),
conv6: extractConvWithBatchNormParams('conv6'),
conv7: extractConvWithBatchNormParams('conv7'),
conv0: extractConvFn('conv0'),
conv1: extractConvFn('conv1'),
conv2: extractConvFn('conv2'),
conv3: extractConvFn('conv3'),
conv4: extractConvFn('conv4'),
conv5: extractConvFn('conv5'),
conv6: extractConvFn('conv6'),
conv7: extractConvFn('conv7'),
conv8: extractConvParams('conv8')
}
......
import * as tf from '@tensorflow/tfjs-core';
import { ConvParams } from '../commons/types';
import { Point } from '../Point';
export type BatchNorm = {
sub: tf.Tensor1D
truediv: tf.Tensor1D
}
export class SeparableConvParams {
constructor(
public depthwise_filter: tf.Tensor4D,
public pointwise_filter: tf.Tensor4D,
public bias: tf.Tensor1D
) {}
}
export type ConvWithBatchNorm = {
conv: ConvParams
bn: BatchNorm
}
export type NetParams = {
conv0: ConvWithBatchNorm
conv1: ConvWithBatchNorm
conv2: ConvWithBatchNorm
conv3: ConvWithBatchNorm
conv4: ConvWithBatchNorm
conv5: ConvWithBatchNorm
conv6: ConvWithBatchNorm
conv7: ConvWithBatchNorm
conv0: ConvWithBatchNorm | SeparableConvParams
conv1: ConvWithBatchNorm | SeparableConvParams
conv2: ConvWithBatchNorm | SeparableConvParams
conv3: ConvWithBatchNorm | SeparableConvParams
conv4: ConvWithBatchNorm | SeparableConvParams
conv5: ConvWithBatchNorm | SeparableConvParams
conv6: ConvWithBatchNorm | SeparableConvParams
conv7: ConvWithBatchNorm | SeparableConvParams
conv8: ConvParams
}
......@@ -35,3 +44,8 @@ export type TinyYolov2ForwardParams = {
inputSize?: SizeType | number
scoreThreshold?: number
}
export type PostProcessingParams = {
scoreThreshold?: number
paddings: Point
}
\ No newline at end of file
......@@ -15,6 +15,14 @@ export function round(num: number) {
return Math.floor(num * 100) / 100
}
export function sigmoid(x: number) {
return 1 / (1 + Math.exp(-x))
}
export function isDimensions(obj: any): boolean {
return obj && obj.width && obj.height
}
export function resolveInput(arg: string | any) {
if (typeof arg === 'string') {
return document.getElementById(arg)
......@@ -37,14 +45,14 @@ export function awaitMediaLoaded(media: HTMLImageElement | HTMLVideoElement | HT
if (!e.currentTarget) return
e.currentTarget.removeEventListener('load', onLoad)
e.currentTarget.removeEventListener('error', onError)
resolve()
resolve(e)
}
function onError(e: Event) {
if (!e.currentTarget) return
e.currentTarget.removeEventListener('load', onLoad)
e.currentTarget.removeEventListener('error', onError)
reject()
reject(e)
}
media.addEventListener('load', onLoad)
......
......@@ -6,7 +6,7 @@ import { toNetInput } from '../../../src';
import * as tf from '@tensorflow/tfjs-core';
import { Point } from '../../../src/Point';
describe('allFaces', () => {
describe('allFacesSsdMobilenetv1', () => {
let imgEl: HTMLImageElement
let facesFaceLandmarkPositions: Point[][]
......@@ -19,14 +19,14 @@ describe('allFaces', () => {
facesFaceDescriptors = await (await fetch('base/test/data/facesFaceDescriptorsSsd.json')).json()
})
describeWithNets('computes full face descriptions', { withAllFaces: true }, ({ allFaces }) => {
describeWithNets('computes full face descriptions', { withAllFacesSsdMobilenetv1: true }, ({ allFacesSsdMobilenetv1 }) => {
const expectedScores = [0.97, 0.88, 0.83, 0.82, 0.59, 0.52]
const maxBoxDelta = 5
const maxLandmarkPointsDelta = 1
it('scores > 0.8', async () => {
const results = await allFaces(imgEl, 0.8)
const results = await allFacesSsdMobilenetv1(imgEl, 0.8)
expect(results.length).toEqual(4)
results.forEach(({ detection, landmarks, descriptor }, i) => {
......@@ -40,7 +40,7 @@ describe('allFaces', () => {
})
it('scores > 0.5', async () => {
const results = await allFaces(imgEl, 0.5)
const results = await allFacesSsdMobilenetv1(imgEl, 0.5)
expect(results.length).toEqual(6)
results.forEach(({ detection, landmarks, descriptor }, i) => {
......@@ -55,11 +55,11 @@ describe('allFaces', () => {
})
describeWithNets('no memory leaks', { withAllFaces: true }, ({ allFaces }) => {
describeWithNets('no memory leaks', { withAllFacesSsdMobilenetv1: true }, ({ allFacesSsdMobilenetv1 }) => {
it('single image element', async () => {
await expectAllTensorsReleased(async () => {
await allFaces(imgEl)
await allFacesSsdMobilenetv1(imgEl)
})
})
......@@ -68,7 +68,7 @@ describe('allFaces', () => {
await expectAllTensorsReleased(async () => {
const netInput = (new NetInput([tensor])).managed()
await allFaces(netInput)
await allFacesSsdMobilenetv1(netInput)
})
tensor.dispose()
......@@ -78,7 +78,7 @@ describe('allFaces', () => {
const tensor = tf.tidy(() => tf.fromPixels(imgEl).expandDims()) as tf.Tensor4D
await expectAllTensorsReleased(async () => {
await allFaces(await toNetInput(tensor, true))
await allFacesSsdMobilenetv1(await toNetInput(tensor, true))
})
tensor.dispose()
......
import * as faceapi from '../../../src';
import { describeWithNets, expectAllTensorsReleased, expectRectClose, expectPointClose, expectMaxDelta } from '../../utils';
import { expectedTinyYolov2SeparableConvBoxes } from './expectedResults';
import { NetInput } from '../../../src/NetInput';
import { toNetInput } from '../../../src';
import * as tf from '@tensorflow/tfjs-core';
import { Point } from '../../../src/Point';
import { SizeType } from '../../../src/tinyYolov2/types';
describe('allFacesTinyYolov2', () => {
let imgEl: HTMLImageElement
let facesFaceLandmarkPositions: Point[][]
let facesFaceDescriptors: number[][]
beforeAll(async () => {
const img = await (await fetch('base/test/images/faces.jpg')).blob()
imgEl = await faceapi.bufferToImage(img)
facesFaceLandmarkPositions = await (await fetch('base/test/data/facesFaceLandmarkPositions.json')).json()
facesFaceDescriptors = await (await fetch('base/test/data/facesFaceDescriptorsSsd.json')).json()
})
describeWithNets('computes full face descriptions', { withAllFacesTinyYolov2: true }, ({ allFacesTinyYolov2 }) => {
it('SizeType.LG', async () => {
const expectedScores = [0.9, 0.9, 0.89, 0.85, 0.85, 0.85]
const maxBoxDelta = 5
const maxLandmarkPointsDelta = 10
const maxDescriptorDelta = 0.06
const results = await allFacesTinyYolov2(imgEl, { inputSize: SizeType.LG })
const detectionOrder = [0, 2, 3, 4, 1, 5]
expect(results.length).toEqual(6)
results.forEach(({ detection, landmarks, descriptor }, i) => {
expect(detection.getImageWidth()).toEqual(imgEl.width)
expect(detection.getImageHeight()).toEqual(imgEl.height)
expect(detection.getScore()).toBeCloseTo(expectedScores[i], 2)
expectRectClose(detection.getBox(), expectedTinyYolov2SeparableConvBoxes[i], maxBoxDelta)
landmarks.getPositions().forEach((pt, j) => expectPointClose(pt, facesFaceLandmarkPositions[detectionOrder[i]][j], maxLandmarkPointsDelta))
descriptor.forEach((val, j) => expectMaxDelta(val, facesFaceDescriptors[detectionOrder[i]][j], maxDescriptorDelta))
})
})
it('SizeType.MD', async () => {
const expectedScores = [0.85, 0.85, 0.84, 0.83, 0.8, 0.8]
const maxBoxDelta = 17
const maxLandmarkPointsDelta = 16
const maxDescriptorDelta = 0.05
const results = await allFacesTinyYolov2(imgEl, { inputSize: SizeType.MD })
const boxOrder = [5, 1, 4, 3, 2, 0]
const detectionOrder = [5, 2, 1, 4, 3, 0]
expect(results.length).toEqual(6)
results.forEach(({ detection, landmarks, descriptor }, i) => {
expect(detection.getImageWidth()).toEqual(imgEl.width)
expect(detection.getImageHeight()).toEqual(imgEl.height)
expect(detection.getScore()).toBeCloseTo(expectedScores[i], 2)
expectRectClose(detection.getBox(), expectedTinyYolov2SeparableConvBoxes[boxOrder[i]], maxBoxDelta)
landmarks.getPositions().forEach((pt, j) => expectPointClose(pt, facesFaceLandmarkPositions[detectionOrder[i]][j], maxLandmarkPointsDelta))
descriptor.forEach((val, j) => expectMaxDelta(val, facesFaceDescriptors[detectionOrder[i]][j], maxDescriptorDelta))
})
})
})
describeWithNets('no memory leaks', { withAllFacesTinyYolov2: true }, ({ allFacesTinyYolov2 }) => {
it('single image element', async () => {
await expectAllTensorsReleased(async () => {
await allFacesTinyYolov2(imgEl)
})
})
it('single tf.Tensor3D', async () => {
const tensor = tf.fromPixels(imgEl)
await expectAllTensorsReleased(async () => {
const netInput = (new NetInput([tensor])).managed()
await allFacesTinyYolov2(netInput)
})
tensor.dispose()
})
it('single batch size 1 tf.Tensor4Ds', async () => {
const tensor = tf.tidy(() => tf.fromPixels(imgEl).expandDims()) as tf.Tensor4D
await expectAllTensorsReleased(async () => {
await allFacesTinyYolov2(await toNetInput(tensor, true))
})
tensor.dispose()
})
})
})
\ No newline at end of file
......@@ -29,6 +29,15 @@ export const expectedTinyYolov2Boxes = [
{ x: 87, y: 30, width: 92, height: 93 }
]
export const expectedTinyYolov2SeparableConvBoxes = [
{ x: 42, y: 257, width: 111, height: 121 },
{ x: 454, y: 175, width: 104, height: 121 },
{ x: 230, y: 45, width: 94, height: 104 },
{ x: 574, y: 62, width: 88, height: 113 },
{ x: 260, y: 233, width: 82, height: 104 },
{ x: 83, y: 24, width: 85, height: 111 }
]
export const expectedMtcnnFaceLandmarks = [
[new Point(117, 58), new Point(156, 63), new Point(141, 86), new Point(109, 98), new Point(147, 104)],
[new Point(82, 292), new Point(134, 304), new Point(104, 330), new Point(72, 342), new Point(120, 353)],
......@@ -38,7 +47,6 @@ export const expectedMtcnnFaceLandmarks = [
[new Point(489, 224), new Point(534, 223), new Point(507, 250), new Point(493, 271), new Point(530, 270)]
]
export function expectMtcnnResults(
results: { faceDetection: faceapi.FaceDetection, faceLandmarks: faceapi.FaceLandmarks5 }[],
boxOrder: number[],
......
import * as faceapi from '../../../src';
import { SizeType } from '../../../src/tinyYolov2/types';
import { describeWithNets, expectAllTensorsReleased, expectRectClose } from '../../utils';
import { expectedTinyYolov2Boxes } from './expectedResults';
import { expectedTinyYolov2Boxes, expectedTinyYolov2SeparableConvBoxes } from './expectedResults';
describe('tinyYolov2', () => {
......@@ -13,7 +12,7 @@ describe('tinyYolov2', () => {
imgEl = await faceapi.bufferToImage(img)
})
describeWithNets('quantized weights', { withTinyYolov2: { quantized: true } }, ({ tinyYolov2 }) => {
describeWithNets('quantized weights', { withTinyYolov2: { quantized: true, withSeparableConv: false } }, ({ tinyYolov2 }) => {
it('inputSize lg, finds all faces', async () => {
const detections = await tinyYolov2.locateFaces(imgEl, { inputSize: SizeType.LG })
......@@ -59,7 +58,7 @@ describe('tinyYolov2', () => {
})
describeWithNets('uncompressed weights', { withTinyYolov2: { quantized: false } }, ({ tinyYolov2 }) => {
describeWithNets('uncompressed weights', { withTinyYolov2: { quantized: false, withSeparableConv: false } }, ({ tinyYolov2 }) => {
it('inputSize lg, finds all faces', async () => {
const detections = await tinyYolov2.locateFaces(imgEl, { inputSize: SizeType.LG })
......@@ -113,7 +112,7 @@ describe('tinyYolov2', () => {
await expectAllTensorsReleased(async () => {
const res = await fetch('base/weights_uncompressed/tiny_yolov2_model.weights')
const weights = new Float32Array(await res.arrayBuffer())
const net = faceapi.createTinyYolov2(weights)
const net = faceapi.createTinyYolov2(weights, false)
net.dispose()
})
})
......@@ -124,8 +123,8 @@ describe('tinyYolov2', () => {
it('disposes all param tensors', async () => {
await expectAllTensorsReleased(async () => {
const net = new faceapi.TinyYolov2()
await net.load('base/weights')
const net = new faceapi.TinyYolov2(false)
await net.load('base/weights_unused')
net.dispose()
})
})
......
import * as faceapi from '../../../src';
import { SizeType } from '../../../src/tinyYolov2/types';
import { describeWithNets, expectAllTensorsReleased, expectRectClose } from '../../utils';
import { expectedTinyYolov2SeparableConvBoxes } from './expectedResults';
describe('tinyYolov2, with separable convolutions', () => {
let imgEl: HTMLImageElement
beforeAll(async () => {
const img = await (await fetch('base/test/images/faces.jpg')).blob()
imgEl = await faceapi.bufferToImage(img)
})
describeWithNets('quantized weights', { withTinyYolov2: { quantized: true } }, ({ tinyYolov2 }) => {
it('inputSize lg, finds all faces', async () => {
const detections = await tinyYolov2.locateFaces(imgEl, { inputSize: SizeType.LG })
const expectedScores = [0.9, 0.9, 0.89, 0.85, 0.85, 0.85]
const maxBoxDelta = 1
const boxOrder = [0, 1, 2, 3, 4, 5]
expect(detections.length).toEqual(6)
detections.forEach((det, i) => {
expect(det.getScore()).toBeCloseTo(expectedScores[i], 2)
expectRectClose(det.getBox(), expectedTinyYolov2SeparableConvBoxes[boxOrder[i]], maxBoxDelta)
})
})
it('inputSize md, finds all faces', async () => {
const detections = await tinyYolov2.locateFaces(imgEl, { inputSize: SizeType.MD })
const expectedScores = [0.85, 0.85, 0.84, 0.83, 0.8, 0.8]
const maxBoxDelta = 17
const boxOrder = [5, 1, 4, 3, 2, 0]
expect(detections.length).toEqual(6)
detections.forEach((det, i) => {
expect(det.getScore()).toBeCloseTo(expectedScores[i], 2)
expectRectClose(det.getBox(), expectedTinyYolov2SeparableConvBoxes[boxOrder[i]], maxBoxDelta)
})
})
it('inputSize custom, finds all faces', async () => {
const detections = await tinyYolov2.locateFaces(imgEl, { inputSize: 416 })
const expectedScores = [0.85, 0.85, 0.84, 0.83, 0.8, 0.8]
const maxBoxDelta = 17
const boxOrder = [5, 1, 4, 3, 2, 0]
expect(detections.length).toEqual(6)
detections.forEach((det, i) => {
expect(det.getScore()).toBeCloseTo(expectedScores[i], 2)
expectRectClose(det.getBox(), expectedTinyYolov2SeparableConvBoxes[boxOrder[i]], maxBoxDelta)
})
})
})
describe('no memory leaks', () => {
describe('NeuralNetwork, uncompressed model', () => {
it('disposes all param tensors', async () => {
await expectAllTensorsReleased(async () => {
const res = await fetch('base/weights_uncompressed/tiny_yolov2_separable_conv_model.weights')
const weights = new Float32Array(await res.arrayBuffer())
const net = faceapi.createTinyYolov2(weights)
net.dispose()
})
})
})
describe('NeuralNetwork, quantized model', () => {
it('disposes all param tensors', async () => {
await expectAllTensorsReleased(async () => {
const net = new faceapi.TinyYolov2()
await net.load('base/weights')
net.dispose()
})
})
})
})
})
\ No newline at end of file
......@@ -4,9 +4,8 @@ import { IRect } from '../build/Rect';
import * as faceapi from '../src/';
import { NeuralNetwork } from '../src/commons/NeuralNetwork';
import { IPoint } from '../src/';
import { allFacesFactory, allFacesMtcnnFactory } from '../src/allFacesFactory';
import { allFacesMtcnnFunction, allFacesFunction, tinyYolov2 } from '../src/globalApi';
import { TinyYolov2 } from '../src/tinyYolov2/TinyYolov2';
import { allFacesMtcnnFactory, allFacesSsdMobilenetv1Factory, allFacesTinyYolov2Factory } from '../src/allFacesFactory';
import { allFacesMtcnnFunction, allFacesSsdMobilenetv1Function, allFacesTinyYolov2, allFacesTinyYolov2Function } from '../src/globalApi';
export function zeros(length: number): Float32Array {
return new Float32Array(length)
......@@ -55,8 +54,13 @@ export type WithNetOptions = {
quantized?: boolean
}
export type WithTinyYolov2Options = WithNetOptions & {
withSeparableConv?: boolean
}
export type InjectNetArgs = {
allFaces: allFacesFunction
allFacesSsdMobilenetv1: allFacesSsdMobilenetv1Function
allFacesTinyYolov2: allFacesTinyYolov2Function
allFacesMtcnn: allFacesMtcnnFunction
faceDetectionNet: faceapi.FaceDetectionNet
faceLandmarkNet: faceapi.FaceLandmarkNet
......@@ -67,13 +71,14 @@ export type InjectNetArgs = {
export type DescribeWithNetsOptions = {
withAllFaces?: boolean
withAllFacesSsdMobilenetv1?: boolean
withAllFacesTinyYolov2?: boolean
withAllFacesMtcnn?: boolean
withFaceDetectionNet?: WithNetOptions
withFaceLandmarkNet?: WithNetOptions
withFaceRecognitionNet?: WithNetOptions
withMtcnn?: WithNetOptions
withTinyYolov2?: WithNetOptions
withTinyYolov2?: WithTinyYolov2Options
}
async function loadNetWeights(uri: string): Promise<Float32Array> {
......@@ -82,12 +87,13 @@ async function loadNetWeights(uri: string): Promise<Float32Array> {
async function initNet<TNet extends NeuralNetwork<any>>(
net: TNet,
uncompressedFilename: string | boolean
uncompressedFilename: string | boolean,
isUnusedModel: boolean = false
) {
await net.load(
uncompressedFilename
? await loadNetWeights(`base/weights_uncompressed/${uncompressedFilename}`)
: 'base/weights'
: (isUnusedModel ? 'base/weights_unused' : 'base/weights')
)
}
......@@ -102,13 +108,15 @@ export function describeWithNets(
let faceLandmarkNet: faceapi.FaceLandmarkNet = new faceapi.FaceLandmarkNet()
let faceRecognitionNet: faceapi.FaceRecognitionNet = new faceapi.FaceRecognitionNet()
let mtcnn: faceapi.Mtcnn = new faceapi.Mtcnn()
let tinyYolov2: faceapi.TinyYolov2 = new faceapi.TinyYolov2()
let allFaces = allFacesFactory(faceDetectionNet, faceLandmarkNet, faceRecognitionNet)
let tinyYolov2: faceapi.TinyYolov2 = new faceapi.TinyYolov2(options.withTinyYolov2 && options.withTinyYolov2.withSeparableConv)
let allFacesSsdMobilenetv1 = allFacesSsdMobilenetv1Factory(faceDetectionNet, faceLandmarkNet, faceRecognitionNet)
let allFacesTinyYolov2 = allFacesTinyYolov2Factory(tinyYolov2, faceLandmarkNet, faceRecognitionNet)
let allFacesMtcnn = allFacesMtcnnFactory(mtcnn, faceRecognitionNet)
beforeAll(async () => {
const {
withAllFaces,
withAllFacesSsdMobilenetv1,
withAllFacesTinyYolov2,
withAllFacesMtcnn,
withFaceDetectionNet,
withFaceLandmarkNet,
......@@ -117,21 +125,21 @@ export function describeWithNets(
withTinyYolov2
} = options
if (withFaceDetectionNet || withAllFaces) {
if (withFaceDetectionNet || withAllFacesSsdMobilenetv1) {
await initNet<faceapi.FaceDetectionNet>(
faceDetectionNet,
!!withFaceDetectionNet && !withFaceDetectionNet.quantized && 'ssd_mobilenetv1_model.weights'
)
}
if (withFaceLandmarkNet || withAllFaces) {
if (withFaceLandmarkNet || withAllFacesSsdMobilenetv1 || withAllFacesTinyYolov2) {
await initNet<faceapi.FaceLandmarkNet>(
faceLandmarkNet,
!!withFaceLandmarkNet && !withFaceLandmarkNet.quantized && 'face_landmark_68_model.weights'
)
}
if (withFaceRecognitionNet || withAllFaces || withAllFacesMtcnn) {
if (withFaceRecognitionNet || withAllFacesSsdMobilenetv1 || withAllFacesMtcnn || withAllFacesTinyYolov2) {
await initNet<faceapi.FaceRecognitionNet>(
faceRecognitionNet,
// TODO: figure out why quantized weights results in NaNs in testcases
......@@ -146,10 +154,11 @@ export function describeWithNets(
)
}
if (withTinyYolov2) {
if (withTinyYolov2 || withAllFacesTinyYolov2) {
await initNet<faceapi.TinyYolov2>(
tinyYolov2,
!!withTinyYolov2 && !withTinyYolov2.quantized && 'tiny_yolov2_model.weights'
!!withTinyYolov2 && !withTinyYolov2.quantized && 'tiny_yolov2_model.weights',
withTinyYolov2 && withTinyYolov2.withSeparableConv === false
)
}
})
......@@ -162,7 +171,16 @@ export function describeWithNets(
tinyYolov2 && tinyYolov2.dispose()
})
specDefinitions({ allFaces, allFacesMtcnn, faceDetectionNet, faceLandmarkNet, faceRecognitionNet, mtcnn, tinyYolov2 })
specDefinitions({
allFacesSsdMobilenetv1,
allFacesTinyYolov2,
allFacesMtcnn,
faceDetectionNet,
faceLandmarkNet,
faceRecognitionNet,
mtcnn,
tinyYolov2
})
})
}
......@@ -9,9 +9,10 @@ app.use(express.static(path.join(__dirname, './public')))
app.use(express.static(path.join(__dirname, './node_modules/file-saver')))
app.use(express.static(path.join(__dirname, '../../examples/public')))
app.use(express.static(path.join(__dirname, '../../weights')))
app.use(express.static(path.join(__dirname, '../../weights_uncompressed')))
app.use(express.static(path.join(__dirname, '../../dist')))
app.get('/', (req, res) => res.redirect('/quantize_tiny_yolov2'))
app.get('/quantize_tiny_yolov2', (req, res) => res.sendFile(path.join(viewsDir, 'quantizeTinyYolov2.html')))
app.get('/', (req, res) => res.redirect('/quantize_model'))
app.get('/quantize_model', (req, res) => res.sendFile(path.join(viewsDir, 'quantizeModel.html')))
app.listen(3000, () => console.log('Listening on port 3000!'))
\ No newline at end of file
......@@ -10,6 +10,10 @@
<script>
tf = faceapi.tf
const modelName = 'tiny_yolov2_separable_conv'
const uncompressedWeightsUri = `tiny_yolov2_separable_conv_model_v1.weights`
const net = new faceapi.TinyYolov2(true)
async function loadNetWeights(uri) {
return new Float32Array(await (await fetch(uri)).arrayBuffer())
}
......@@ -36,15 +40,13 @@
}
async function quantizeAndSave() {
const modelName = 'tiny_yolov2'
const tinyYolov2 = new faceapi.TinyYolov2()
await tinyYolov2.load(await loadNetWeights(`uncompressed/tiny_yolov2_model.weights`))
await net.load(await loadNetWeights(uncompressedWeightsUri))
const quantizedTensorArrays = []
const weightEntries = []
tinyYolov2.getParamList().forEach(({ path, tensor }) => {
net.getParamList().forEach(({ path, tensor }) => {
const { scale, min, qdata } = quantizeWeights(tensor)
const weightEntry = {
......
async function promiseSequential(promises) {
const curr = promises[0]
if (!curr) {
return
}
await curr()
return promiseSequential(promises.slice(1))
}
async function trainStep(batchCreators) {
await promiseSequential(batchCreators.map((batchCreator, dataIdx) => async () => {
const { batchInput, landmarksBatchTensor } = await batchCreator()
let ts = Date.now()
const cost = optimizer.minimize(() => {
const out = window.trainNet.forwardInput(batchInput.managed())
......@@ -19,7 +10,7 @@ async function trainStep(batchCreators) {
landmarksBatchTensor,
out
)
return loss
return tf.sum(out)
}, true)
ts = Date.now() - ts
......@@ -29,6 +20,7 @@ async function trainStep(batchCreators) {
cost.dispose()
await tf.nextFrame()
console.log(tf.memory())
}))
}
......@@ -73,7 +65,7 @@ function landmarkPositionsToArray(landmarks) {
}
function toFaceLandmarks(landmarks, { naturalWidth, naturalHeight }) {
return new faceapi.FaceLandmarks(
return new faceapi.FaceLandmarks68(
landmarks.map(l => new faceapi.Point(l.x / naturalWidth, l.y / naturalHeight)),
{ width: naturalWidth, height: naturalHeight }
)
......@@ -100,20 +92,11 @@ async function getTrainData() {
(_, i) => landmarksJson[i]
)
return await loadImagesInBatch(allLandmarks.slice(0, 100))
/**
const batch1 = await loadImagesInBatch(allLandmarks.slice(0, 4000))
const batch2 = await loadImagesInBatch(allLandmarks.slice(4000), 4000)
return batch1.concat(batch2)
}
// https://stackoverflow.com/questions/6274339/how-can-i-shuffle-an-array
function shuffle(a) {
var j, x, i;
for (i = a.length - 1; i > 0; i--) {
j = Math.floor(Math.random() * (i + 1));
x = a[i];
a[i] = a[j];
a[j] = x;
}
return a;
*/
}
\ No newline at end of file
......@@ -60,7 +60,7 @@ async function init() {
//window.nets.push(await loadNet('retrained/landmarks_v0.weights'))
//window.nets.push(await loadNet('retrained/landmarks_v2.weights'))
window.trainNet = await loadNet('retrained/landmarks_v6.weights')
window.trainNet = await loadNet('/tmp/retrained/landmarks_v9.weights')
window.nets.push(trainNet)
$('#loader').hide()
......
......@@ -10,6 +10,7 @@
<script src="faceLandmarksUi.js"></script>
<script src="faceLandmarksTrain.js"></script>
<script src="FileSaver.js"></script>
<script src="trainUtils.js"></script>
</head>
<body>
<div id="navbar"></div>
......@@ -60,7 +61,7 @@
await train()
}
async function train(batchSize = 10) {
async function train(batchSize = 1) {
for (let i = 0; i < trainSteps; i++) {
console.log('step', i)
const batchCreators = createBatchCreators(shuffle(window.trainData), batchSize)
......@@ -69,20 +70,11 @@
ts = Date.now() - ts
console.log('step %s done (%s ms)', i, ts)
if (((i + 1) % saveEveryNthIteration) === 0) {
saveWeights(i)
//saveWeights(window.trainNet, 'landmark_trained_weights_' + idx + '.weights')
}
}
}
function saveWeights(idx = 0) {
const binaryWeights = new Float32Array(
window.trainNet.getParamList()
.map(({ tensor }) => Array.from(tensor.dataSync()))
.reduce((flat, arr) => flat.concat(arr))
)
saveAs(new Blob([binaryWeights]), 'landmark_trained_weights_' + idx + '.weights')
}
</script>
</body>
......
module.exports = function(config) {
config.set({
frameworks: ['jasmine', 'karma-typescript'],
files: [
'tinyYolov2/**/*.ts'
],
preprocessors: {
'**/*.ts': ['karma-typescript']
},
karmaTypescriptConfig: {
tsconfig: './tsconfig.test.json'
},
browsers: ['Chrome'],
browserNoActivityTimeout: 60000,
client: {
jasmine: {
timeoutInterval: 30000
}
}
})
}
{
"scripts": {
"start": "node server.js"
"start": "node server.js",
"test": "karma start"
},
"author": "justadudewhohacks",
"license": "MIT",
......
require('./.env')
require('./faceLandmarks/.env')
const express = require('express')
const path = require('path')
const app = express()
const viewsDir = path.join(__dirname, 'views')
app.use(express.static(viewsDir))
app.use(express.static(path.join(__dirname, './public')))
app.use(express.static(path.join(__dirname, './tmp')))
const publicDir = path.join(__dirname, './faceLandmarks')
app.use(express.static(publicDir))
app.use(express.static(path.join(__dirname, './shared')))
app.use(express.static(path.join(__dirname, './node_modules/file-saver')))
app.use(express.static(path.join(__dirname, '../../examples/public')))
app.use(express.static(path.join(__dirname, '../../weights')))
......@@ -18,6 +17,6 @@ const trainDataPath = path.resolve(process.env.TRAIN_DATA_PATH)
app.use(express.static(trainDataPath))
app.get('/', (req, res) => res.redirect('/face_landmarks'))
app.get('/face_landmarks', (req, res) => res.sendFile(path.join(viewsDir, 'faceLandmarks.html')))
app.get('/face_landmarks', (req, res) => res.sendFile(path.join(publicDir, 'train.html')))
app.listen(3000, () => console.log('Listening on port 3000!'))
\ No newline at end of file
require('./tinyYolov2/.env')
const express = require('express')
const path = require('path')
const fs = require('fs')
const app = express()
const publicDir = path.join(__dirname, './tinyYolov2')
app.use(express.static(publicDir))
app.use(express.static(path.join(__dirname, './shared')))
app.use(express.static(path.join(__dirname, './node_modules/file-saver')))
app.use(express.static(path.join(__dirname, '../../examples/public')))
app.use(express.static(path.join(__dirname, '../../weights')))
app.use(express.static(path.join(__dirname, '../../dist')))
const trainDataPath = path.resolve(process.env.TRAIN_DATA_PATH)
const testDataPath = path.resolve(process.env.TEST_DATA_PATH)
const imagesPath = path.join(trainDataPath, './final_images')
const detectionsPath = path.join(trainDataPath, './final_detections')
app.use(express.static(imagesPath))
app.use(express.static(detectionsPath))
app.use(express.static(testDataPath))
const detectionFilenames = fs.readdirSync(detectionsPath)
const detectionFilenamesMultibox = JSON.parse(fs.readFileSync(path.join(__dirname, './tinyYolov2/multibox.json')))
app.use(express.static(trainDataPath))
app.get('/detection_filenames', (req, res) => res.status(202).send(detectionFilenames))
app.get('/detection_filenames_multibox', (req, res) => res.status(202).send(detectionFilenamesMultibox))
app.get('/', (req, res) => res.sendFile(path.join(publicDir, 'train.html')))
app.get('/verify', (req, res) => res.sendFile(path.join(publicDir, 'verify.html')))
app.get('/test', (req, res) => res.sendFile(path.join(publicDir, 'test.html')))
app.listen(3000, () => console.log('Listening on port 3000!'))
\ No newline at end of file
const log = (str, ...args) => console.log(`[${[(new Date()).toTimeString().substr(0, 8)]}] ${str || ''}`, ...args)
async function promiseSequential(promises) {
const curr = promises[0]
if (!curr) {
return
}
await curr()
return promiseSequential(promises.slice(1))
}
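promiseSequential runs an array of promise factories strictly one after another; the training loops below rely on it to process batches sequentially rather than concurrently. A minimal usage sketch with made-up tasks:

```js
// Minimal usage sketch (made-up tasks): each factory is awaited before
// the next one starts, unlike Promise.all which would run them concurrently.
const tasks = [1, 2, 3].map(n => async () => console.log('processing batch', n))
promiseSequential(tasks).then(() => console.log('all batches done'))
```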
// https://stackoverflow.com/questions/6274339/how-can-i-shuffle-an-array
function shuffle(a) {
var j, x, i;
for (i = a.length - 1; i > 0; i--) {
j = Math.floor(Math.random() * (i + 1));
x = a[i];
a[i] = a[j];
a[j] = x;
}
return a;
}
function saveWeights(net, filename = 'train_tmp') {
const binaryWeights = new Float32Array(
net.getParamList()
.map(({ tensor }) => Array.from(tensor.dataSync()))
.reduce((flat, arr) => flat.concat(arr))
)
saveAs(new Blob([binaryWeights]), filename)
}
function imageToSquare(img) {
const scale = 608 / Math.max(img.height, img.width)
const width = scale * img.width
const height = scale * img.height
const canvas1 = faceapi.createCanvasFromMedia(img)
const targetCanvas = faceapi.createCanvas({ width: 608, height: 608 })
targetCanvas.getContext('2d').putImageData(canvas1.getContext('2d').getImageData(0, 0, width, height), 0, 0)
return targetCanvas
}
function getPaddingsAndReshapedSize(img, inputSize) {
const [h, w] = [img.height, img.width]
const maxDim = Math.max(h, w)
const f = inputSize / maxDim
const reshapedImgDims = {
height: Math.floor(h * f),
width: Math.floor(w * f)
}
const paddings = new faceapi.Point(
maxDim / img.width,
maxDim / img.height
)
return { paddings, reshapedImgDims }
}
\ No newline at end of file
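For intuition about the values getPaddingsAndReshapedSize returns, a small worked example (the image dimensions are illustrative only):

```js
// Illustrative numbers: an 800x600 image scaled to an inputSize of 416.
// f = 416 / 800 = 0.52
// reshapedImgDims -> { height: 312, width: 416 }
// paddings        -> Point(800 / 800, 800 / 600) = (1, ~1.33)
const { paddings, reshapedImgDims } = getPaddingsAndReshapedSize({ width: 800, height: 600 }, 416)
```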
<!DOCTYPE html>
<html>
<head>
<script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs@0.12.0"> </script>
<script src="FileSaver.js"></script>
</head>
<body>
<script>
const glorotNormal = tf.initializers.glorotNormal()
saveWeights()
function initSeparableConvWeights(inChannels, outChannels) {
return {
depthwiseFilter: glorotNormal.apply([3, 3, inChannels, 1]),
pointwiseFilter: glorotNormal.apply([1, 1, inChannels, outChannels]),
bias: tf.zeros([outChannels])
}
}
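The shapes above show why the separable variant is so much smaller: a 3x3 depthwise filter plus a 1x1 pointwise filter replaces a full 3x3 convolution. A rough per-layer parameter count for the largest layer (illustrative arithmetic, biases included):

```js
// Rough parameter counts for a single 3x3 layer (illustrative only):
const regularConvParams = (inCh, outCh) => 3 * 3 * inCh * outCh + outCh
const separableConvParams = (inCh, outCh) => 3 * 3 * inCh + inCh * outCh + outCh
console.log(regularConvParams(512, 1024))   // 4719616
console.log(separableConvParams(512, 1024)) // 529920, roughly 9x fewer weights
```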
function initTinyYolov2SeparableWeights() {
const conv0 = initSeparableConvWeights(3, 16)
const conv1 = initSeparableConvWeights(16, 32)
const conv2 = initSeparableConvWeights(32, 64)
const conv3 = initSeparableConvWeights(64, 128)
const conv4 = initSeparableConvWeights(128, 256)
const conv5 = initSeparableConvWeights(256, 512)
const conv6 = initSeparableConvWeights(512, 1024)
const conv7 = initSeparableConvWeights(1024, 1024)
const conv8 = {
filters: glorotNormal.apply([1, 1, 1024, 25]),
bias: tf.zeros([25])
}
return {
conv0,
conv1,
conv2,
conv3,
conv4,
conv5,
conv6,
conv7,
conv8
}
}
function saveWeights() {
const w = initTinyYolov2SeparableWeights()
const binaryWeights = new Float32Array(
Array(8).fill(0)
.map((_, i) => w[`conv${i}`])
.map(ps => [ps.depthwiseFilter, ps.pointwiseFilter, ps.bias])
.reduce((flat, arr) => flat.concat(arr))
.concat([w.conv8.filters, w.conv8.bias])
.map(tensor => console.log(tensor) || Array.from(tensor.dataSync()))
.reduce((flat, arr) => flat.concat(arr))
)
saveAs(new Blob([binaryWeights]), 'foo.weights')
}
</script>
</body>
</html>
\ No newline at end of file
const CELL_SIZE = 32
const getNumCells = inputSize => inputSize / CELL_SIZE
const inverseSigmoid = x => Math.log(x / (1 - x))
function getAnchors() {
return window.net.anchors
}
function squaredSumOverMask(lossTensors, mask) {
return tf.tidy(() => tf.sum(tf.square(tf.mul(mask, lossTensors))))
}
// assigns each ground truth box to the grid cell containing its center
// and to the anchor whose shape has the highest IoU with the box
function assignBoxesToAnchors(groundTruthBoxes, reshapedImgDims) {
const inputSize = Math.max(reshapedImgDims.width, reshapedImgDims.height)
const numCells = getNumCells(inputSize)
return groundTruthBoxes.map(box => {
const { left, top, width, height } = box.rescale(reshapedImgDims)
const ctX = left + (width / 2)
const ctY = top + (height / 2)
const col = Math.floor((ctX / inputSize) * numCells)
const row = Math.floor((ctY / inputSize) * numCells)
const anchorsByIou = getAnchors().map((anchor, idx) => ({
idx,
iou: faceapi.iou(
new faceapi.BoundingBox(0, 0, anchor.x * CELL_SIZE, anchor.y * CELL_SIZE),
new faceapi.BoundingBox(0, 0, width, height)
)
})).sort((a1, a2) => a2.iou - a1.iou)
const anchor = anchorsByIou[0].idx
return { row, col, anchor, box }
})
}
// masks have 25 channels per cell: 5 anchors x (4 box offsets + 1 score)
function getGroundTruthMask(groundTruthBoxes, inputSize) {
const numCells = getNumCells(inputSize)
const mask = tf.zeros([numCells, numCells, 25])
const buf = mask.buffer()
groundTruthBoxes.forEach(({ row, col, anchor }) => {
const anchorOffset = anchor * 5
for (let i = 0; i < 5; i++) {
buf.set(1, row, col, anchorOffset + i)
}
})
return mask
}
function getCoordAndScoreMasks(inputSize) {
const numCells = getNumCells(inputSize)
const coordMask = tf.zeros([numCells, numCells, 25])
const scoreMask = tf.zeros([numCells, numCells, 25])
const coordBuf = coordMask.buffer()
const scoreBuf = scoreMask.buffer()
for (let row = 0; row < numCells; row++) {
for (let col = 0; col < numCells; col++) {
for (let anchor = 0; anchor < 5; anchor++) {
const anchorOffset = 5 * anchor
for (let i = 0; i < 4; i++) {
coordBuf.set(1, row, col, anchorOffset + i)
}
scoreBuf.set(1, row, col, anchorOffset + 4)
}
}
}
return { coordMask, scoreMask }
}
function computeBoxAdjustments(groundTruthBoxes, reshapedImgDims) {
const inputSize = Math.max(reshapedImgDims.width, reshapedImgDims.height)
const numCells = getNumCells(inputSize)
const adjustments = tf.zeros([numCells, numCells, 25])
const buf = adjustments.buffer()
groundTruthBoxes.forEach(({ row, col, anchor, box }) => {
const { left, top, right, bottom, width, height } = box.rescale(reshapedImgDims)
const centerX = (left + right) / 2
const centerY = (top + bottom) / 2
//const dCenterX = centerX - (col * CELL_SIZE + (CELL_SIZE / 2))
//const dCenterY = centerY - (row * CELL_SIZE + (CELL_SIZE / 2))
const dCenterX = centerX - (col * CELL_SIZE)
const dCenterY = centerY - (row * CELL_SIZE)
const dx = inverseSigmoid(dCenterX / CELL_SIZE)
const dy = inverseSigmoid(dCenterY / CELL_SIZE)
//const dx = dCenterX / CELL_SIZE
//const dy = dCenterY / CELL_SIZE
const dw = Math.log((width / CELL_SIZE) / getAnchors()[anchor].x)
const dh = Math.log((height / CELL_SIZE) / getAnchors()[anchor].y)
const anchorOffset = anchor * 5
buf.set(dx, row, col, anchorOffset + 0)
buf.set(dy, row, col, anchorOffset + 1)
buf.set(dw, row, col, anchorOffset + 2)
buf.set(dh, row, col, anchorOffset + 3)
})
return adjustments
}
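computeBoxAdjustments encodes each ground truth box relative to its grid cell and anchor using the usual YOLO parameterization. For reference, a minimal sketch of the inverse mapping, mirroring the formulas above (this is not the library's actual postProcess implementation):

```js
// Minimal sketch of inverting the (dx, dy, dw, dh) encoding above,
// in pixel coordinates of the reshaped image (illustrative only).
function decodeBoxAdjustment({ row, col, anchor }, [dx, dy, dw, dh]) {
  const sigmoid = x => 1 / (1 + Math.exp(-x))
  const centerX = (col + sigmoid(dx)) * CELL_SIZE
  const centerY = (row + sigmoid(dy)) * CELL_SIZE
  const width = Math.exp(dw) * getAnchors()[anchor].x * CELL_SIZE
  const height = Math.exp(dh) * getAnchors()[anchor].y * CELL_SIZE
  return { x: centerX - width / 2, y: centerY - height / 2, width, height }
}
```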
function computeIous(predBoxes, groundTruthBoxes, reshapedImgDims) {
const inputSize = Math.max(reshapedImgDims.width, reshapedImgDims.height)
const numCells = getNumCells(inputSize)
const isSameAnchor = p1 => p2 =>
p1.row === p2.row
&& p1.col === p2.col
&& p1.anchor === p2.anchor
const ious = tf.zeros([numCells, numCells, 25])
const buf = ious.buffer()
groundTruthBoxes.forEach(({ row, col, anchor, box }) => {
const predBox = predBoxes.find(isSameAnchor({ row, col, anchor }))
if (!predBox) {
console.log(groundTruthBoxes)
console.log(predBoxes)
throw new Error(`no output box found for: row ${row}, col ${col}, anchor ${anchor}`)
}
const iou = faceapi.iou(
box.rescale(reshapedImgDims),
predBox.box.rescale(reshapedImgDims)
)
if (window.debug) {
console.log('ground truth box:', box.rescale(reshapedImgDims).toRect())
console.log('predicted box:', predBox.box.rescale(reshapedImgDims).toRect())
console.log('predicted score:', predBox.score)
console.log('iou:', iou)
}
const anchorOffset = anchor * 5
buf.set(iou, row, col, anchorOffset + 4)
})
return ious
}
window.computeNoObjectLoss = function(outTensor, mask) {
return tf.tidy(() => {
const lossTensor = tf.sigmoid(outTensor)
return squaredSumOverMask(lossTensor, mask)
})
}
function computeObjectLoss(outTensor, groundTruthBoxes, reshapedImgDims, paddings, mask) {
return tf.tidy(() => {
const predBoxes = window.net.postProcess(
outTensor,
{ paddings }
)
const ious = computeIous(
predBoxes,
groundTruthBoxes,
reshapedImgDims
)
const lossTensor = tf.sub(ious, tf.sigmoid(outTensor))
return squaredSumOverMask(lossTensor, mask)
})
}
function computeCoordLoss(groundTruthBoxes, outTensor, reshapedImgDims, mask, paddings) {
return tf.tidy(() => {
const boxAdjustments = computeBoxAdjustments(
groundTruthBoxes,
reshapedImgDims
)
if (window.debug) {
const indToPos = []
const numCells = outTensor.shape[1]
for (let row = 0; row < numCells; row++) {
for (let col = 0; col < numCells; col++) {
for (let anchor = 0; anchor < 25; anchor++) {
indToPos.push({ row, col, anchor: Math.floor(anchor / 5) })
}
}
}
const indices = Array.from(mask.dataSync()).map((val, ind) => ({ val, ind })).filter(v => v.val !== 0).map(v => v.ind)
const gt = Array.from(boxAdjustments.dataSync())
const out = Array.from(outTensor.dataSync())
const comp = indices.map(i => (
{
pos: indToPos[i],
gt: gt[i],
out: out[i]
}
))
console.log(comp.map(c => `gt: ${c.gt}, out: ${c.out}`))
const getBbox = (which) => {
const { row, col, anchor } = comp[0].pos
const ctX = ((col + faceapi.sigmoid(comp[0][which])) / numCells) * paddings.x
const ctY = ((row + faceapi.sigmoid(comp[1][which])) / numCells) * paddings.y
const width = ((Math.exp(comp[2][which]) * getAnchors()[anchor].x) / numCells) * paddings.x
const height = ((Math.exp(comp[3][which]) * getAnchors()[anchor].y) / numCells) * paddings.y
const x = (ctX - (width / 2))
const y = (ctY - (height / 2))
return new faceapi.BoundingBox(x, y, x + width, y + height)
}
const outRect = getBbox('out').rescale(reshapedImgDims).toRect()
const gtRect = getBbox('gt').rescale(reshapedImgDims).toRect()
console.log('out', outRect)
console.log('gtRect', gtRect)
}
const lossTensor = tf.sub(boxAdjustments, outTensor)
return squaredSumOverMask(lossTensor, mask)
})
}
function computeLoss(outTensor, groundTruth, reshapedImgDims, paddings) {
const inputSize = Math.max(reshapedImgDims.width, reshapedImgDims.height)
if (!inputSize) {
throw new Error(`invalid inputSize: ${inputSize}`)
}
let groundTruthBoxes = assignBoxesToAnchors(
groundTruth
.map(({ x, y, width, height }) => new faceapi.Rect(x, y, width, height))
.map(rect => rect.toBoundingBox()),
reshapedImgDims
)
const groundTruthMask = getGroundTruthMask(groundTruthBoxes, inputSize)
const { coordMask, scoreMask } = getCoordAndScoreMasks(inputSize)
const noObjectLossMask = tf.tidy(() => tf.mul(scoreMask, tf.sub(tf.scalar(1), groundTruthMask)))
const objectLossMask = tf.tidy(() => tf.mul(scoreMask, groundTruthMask))
const coordLossMask = tf.tidy(() => tf.mul(coordMask, groundTruthMask))
const noObjectLoss = tf.tidy(() =>
tf.mul(
tf.scalar(noObjectScale),
computeNoObjectLoss(outTensor, noObjectLossMask)
)
)
const objectLoss = tf.tidy(() =>
tf.mul(
tf.scalar(objectScale),
computeObjectLoss(outTensor, groundTruthBoxes, reshapedImgDims, paddings, objectLossMask)
)
)
const coordLoss = tf.tidy(() =>
tf.mul(
tf.scalar(coordScale),
computeCoordLoss(groundTruthBoxes, outTensor, reshapedImgDims, coordLossMask, paddings)
)
)
const totalLoss = tf.tidy(() => noObjectLoss.add(objectLoss).add(coordLoss))
return {
noObjectLoss,
objectLoss,
coordLoss,
totalLoss
}
}
\ No newline at end of file
import * as _tf from '@tensorflow/tfjs-core';
const faceapi = require('../../../dist/face-api.js')
const tf: typeof _tf = faceapi.tf
require('./loss')
window['faceapi'] = faceapi
window['tf'] = tf
const anchors = [
new faceapi.Point(1.603231, 2.094468),
new faceapi.Point(6.041143, 7.080126),
new faceapi.Point(2.882459, 3.518061),
new faceapi.Point(4.266906, 5.178857),
new faceapi.Point(9.041765, 10.66308)
]
window['net'] = {
getAnchors() {
return anchors
}
}
describe('loss', () => {
describe('computeNoObjectLoss', () => {
const computeNoObjectLoss = window['computeNoObjectLoss']
it('should only compute loss over scores, 1x1 grid', () => tf.tidy(() => {
const outTensor = tf.zeros([1, 1, 1, 25])
const loss = tf.sum(computeNoObjectLoss(outTensor)).dataSync()[0]
expect(loss).toEqual(0.5 * 0.5 * 5)
}))
it('should only compute loss over scores, 13x13 grid', () => tf.tidy(() => {
const outTensor = tf.zeros([1, 13, 13, 25])
const loss = tf.sum(computeNoObjectLoss(outTensor)).dataSync()[0]
expect(loss).toEqual(0.5 * 0.5 * 5 * 13 * 13)
}))
it('should only compute loss over scores, 13x13 grid, batchSize: 10', () => tf.tidy(() => {
const outTensor = tf.zeros([10, 13, 13, 25])
const loss = tf.sum(computeNoObjectLoss(outTensor)).dataSync()[0]
expect(loss).toEqual(0.5 * 0.5 * 5 * 13 * 13 * 10)
}))
})
})
\ No newline at end of file
<!DOCTYPE html>
<html>
<head>
<script src="face-api.js"></script>
<script src="commons.js"></script>
<script src="trainUtils.js"></script>
<script src="loss.js"></script>
<script src="FileSaver.js"></script>
<link rel="stylesheet" href="styles.css">
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/materialize/0.100.2/css/materialize.css">
<script type="text/javascript" src="https://code.jquery.com/jquery-2.1.1.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/materialize/0.100.2/js/materialize.min.js"></script>
</head>
<body>
<script>
window.tf = faceapi.tf
// hyper parameters
window.objectScale = 5
window.noObjectScale = 1
window.coordScale = 1
const weightsUrl = `/tmp/tmp__224_35060__320_41188__416_31050__608_16520.weights`
//const inputSizes = [160, 224, 320, 416]
const inputSizes = [512, 608]
async function loadNetWeights(uri) {
return new Float32Array(await (await fetch(uri)).arrayBuffer())
}
async function fetchFddbJson() {
return fetch('/fddb-detections.json').then(res => res.json())
}
async function run() {
window.fddbJson = await fetchFddbJson()
const weights = await loadNetWeights(weightsUrl)
window.net = new faceapi.TinyYolov2(true)
await window.net.load(weights)
await promiseSequential(inputSizes.map(inputSize => async () => {
await promiseSequential(window.fddbJson.map(({ filePath, rects }) => () => {
return test(filePath, rects, inputSize)
}))
const losses = Object.keys(window.lossMap[inputSize]).map(k => window.lossMap[inputSize][k])
const totalLoss = losses
.map(l => l.totalLoss)
.reduce((sum, l) => sum + l)
const avgLoss = totalLoss / losses.length
log(`totalLoss (${inputSize}): ${totalLoss}`)
log(`avgLoss (${inputSize}): ${avgLoss}`)
window.losses = window.losses || {}
window.losses[inputSize] = { totalLoss, avgLoss }
}))
console.log(window.losses)
}
async function test(fileUri, rects, inputSize) {
const img = await faceapi.bufferToImage(await fetchImage(fileUri))
const groundTruthBoxes = rects
.map(({ x, y, width, height }) => new faceapi.Rect(x, y, width, height))
.map(rect => rect.clipAtImageBorders(img.width, img.height))
.map(({ x, y, width, height }) => ({
x: x / img.width,
y: y / img.height,
width: width / img.width,
height: height / img.height,
}))
const { reshapedImgDims, paddings } = getPaddingsAndReshapedSize(img, inputSize)
const squareImg = imageToSquare(img)
const netInput = (await faceapi.toNetInput(squareImg)).managed()
const losses = tf.tidy(() => {
const outTensor = window.net.forwardInput(netInput, inputSize)
const {
noObjectLoss,
objectLoss,
coordLoss,
totalLoss
} = computeLoss(
outTensor,
groundTruthBoxes,
reshapedImgDims,
paddings
)
const losses = {
totalLoss: totalLoss.dataSync()[0],
noObjectLoss: noObjectLoss.dataSync()[0],
objectLoss: objectLoss.dataSync()[0],
coordLoss: coordLoss.dataSync()[0]
}
return losses
})
log(`${fileUri}:`)
log(`ground truth boxes: ${groundTruthBoxes.length}`)
log(`noObjectLoss: ${losses.noObjectLoss}`)
log(`objectLoss: ${losses.objectLoss}`)
log(`coordLoss: ${losses.coordLoss}`)
log(`totalLoss: ${losses.totalLoss}`)
if (Object.keys(losses).map(k => losses[k]).some(loss => isNaN(loss) || loss === Infinity)) {
console.log(groundTruthBoxes)
console.log(img)
console.log(losses)
throw new Error('corrupted loss value')
}
window.lossMap = window.lossMap || {}
window.lossMap[inputSize] = window.lossMap[inputSize] || {}
window.lossMap[inputSize][fileUri] = losses
}
$(document).ready(function() {
run()
})
</script>
</body>
</html>
\ No newline at end of file
<!DOCTYPE html>
<html>
<head>
<link rel="stylesheet" href="styles.css">
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/materialize/0.100.2/css/materialize.css">
<script type="text/javascript" src="https://code.jquery.com/jquery-2.1.1.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/materialize/0.100.2/js/materialize.min.js"></script>
<script src="face-api.js"></script>
<script src="commons.js"></script>
<script src="FileSaver.js"></script>
<script src="trainUtils.js"></script>
<script src="train.js"></script>
<script src="loss.js"></script>
</head>
<body>
<script>
tf = faceapi.tf
const startIdx160 = 26600
const startIdx224 = 61660
const startIdx320 = 67788
const startIdx416 = 57650
const startIdx608 = 16520
//const weightsUrl = `/tmp/tmp__160_${startIdx160}__224_${startIdx224}__320_${startIdx320}__416_${startIdx416}__608_${startIdx608}.weights`
const weightsUrl = `/tmp/tmp_multiscale_count_8700.weights`
const fromEpoch = 0
const trainOnlyMultibox = false
const trainSizes = [416, 512, 608]
//const trainSizes = [608]
window.debug = false
window.logTrainSteps = true
window.count = 0
// hyper parameters
window.objectScale = 5
window.noObjectScale = 1
window.coordScale = 1
const rescaleEveryNthBatch = 100
window.saveEveryNthDataIdx = trainSizes.length * rescaleEveryNthBatch
window.trainSteps = 4000
//window.optimizer = tf.train.sgd(0.001)
window.optimizer = tf.train.adam(0.001, 0.9, 0.999, 1e-8)
// all samples
//const dataStartIdx = 8000
const dataStartIdx = 0
const numTrainSamples = Infinity
async function loadNetWeights(uri) {
return new Float32Array(await (await fetch(uri)).arrayBuffer())
}
async function fetchDetectionFilenames() {
return fetch('/detection_filenames').then(res => res.json())
}
async function fetchDetectionFilenamesMultibox() {
return fetch('/detection_filenames_multibox').then(res => res.json())
}
async function run() {
const weights = await loadNetWeights(weightsUrl)
window.net = new faceapi.TinyYolov2(true)
window.net.load(weights)
window.net.variable()
const fetchDetectionsFn = trainOnlyMultibox
? fetchDetectionFilenamesMultibox
: fetchDetectionFilenames
window.detectionFilenames = (await fetchDetectionsFn()).slice(dataStartIdx, dataStartIdx + numTrainSamples)
window.lossMap = {}
console.log('ready')
}
function logLossChange(lossType) {
const { currentLoss, prevLoss, detectionFilenames } = window
log(`${lossType} : ${faceapi.round(currentLoss[lossType])} (avg: ${faceapi.round(currentLoss[lossType] / detectionFilenames.length)}) (delta: ${currentLoss[lossType] - prevLoss[lossType]})`)
}
function onBatchProcessed(dataIdx, inputSize) {
window.count++
const idx = (dataIdx + 1) + (window.epoch * window.detectionFilenames.length)
console.log('dataIdx', dataIdx)
if ((window.count % saveEveryNthDataIdx) === 0) {
saveWeights(window.net, `tmp_multiscale_count_${window.count}.weights`)
}
}
function _onBatchProcessed(dataIdx, inputSize) {
const idx = (dataIdx + 1) + (window.epoch * window.detectionFilenames.length)
console.log('idx', idx)
if ((idx % saveEveryNthDataIdx) === 0) {
saveWeights(window.net, `tmp__224_${startIdx224 + (inputSize === 224 ? idx : 0)}__320_${startIdx320 + (inputSize === 320 ? idx : 0)}__416_${startIdx416 + (inputSize === 416 ? idx : 0)}__608_${startIdx608 + (inputSize === 608 ? idx : 0)}.weights`)
}
}
async function train() {
const batchSize = 1
for (let i = fromEpoch; i < trainSteps; i++) {
window.epoch = i
log('step', i)
let ts2 = Date.now()
const batchCreators = createBatchCreators(shuffle(window.detectionFilenames), batchSize)
await trainStep(batchCreators, trainSizes, rescaleEveryNthBatch, onBatchProcessed)
ts2 = Date.now() - ts2
}
ts = Date.now() - ts
log()
log('--------------------')
log()
log('step %s done (%s ms)', i, ts)
window.prevLoss = window.currentLoss
window.currentLoss = Object.keys(lossMap)
.map(filename => lossMap[filename])
.reduce((accumulatedLosses, losses) =>
Object.keys(losses)
.map(key => ({
[key]: (accumulatedLosses[key] || 0) + losses[key]
}))
.reduce((map, curr) => ({ ...map, ...curr }), {}),
{}
)
if (window.prevLoss) {
logLossChange('noObjectLoss')
logLossChange('objectLoss')
logLossChange('coordLoss')
logLossChange('totalLoss')
}
log()
log('--------------------')
log()
}
run()
</script>
</body>
</html>
\ No newline at end of file
const batchIdx = 0
function minimize(groundTruthBoxes, batchInput, inputSize, batch, { reshapedImgDims, paddings }) {
const filename = batch.filenames[batchIdx]
const { dataIdx } = batch
return optimizer.minimize(() => {
const outTensor = window.net.forwardInput(batchInput, inputSize)
const {
noObjectLoss,
objectLoss,
coordLoss,
totalLoss
} = computeLoss(
outTensor,
groundTruthBoxes,
reshapedImgDims,
paddings
)
const losses = {
totalLoss: totalLoss.dataSync()[0],
noObjectLoss: noObjectLoss.dataSync()[0],
objectLoss: objectLoss.dataSync()[0],
coordLoss: coordLoss.dataSync()[0]
}
const lossKey = `${filename}_${inputSize}`
if (window.logTrainSteps) {
log(`ground truth boxes: ${groundTruthBoxes.length}`)
log(`noObjectLoss[${dataIdx}]: ${losses.noObjectLoss}`)
log(`objectLoss[${dataIdx}]: ${losses.objectLoss}`)
log(`coordLoss[${dataIdx}]: ${losses.coordLoss}`)
log(`totalLoss[${dataIdx}]: ${losses.totalLoss}`)
if (window.lossMap[lossKey]) {
log(`loss change: ${losses.totalLoss - window.lossMap[lossKey].totalLoss}`)
}
}
window.lossMap[lossKey] = losses
return totalLoss
}, true)
}
async function trainStep(batchCreators, inputSizes, rescaleEveryNthBatch, onBatchProcessed = () => {}) {
async function step(currentBatchCreators) {
if (!currentBatchCreators.batchCreators.length) {
return
}
await promiseSequential(inputSizes.map(inputSize => async () => {
await promiseSequential(currentBatchCreators.batchCreators.map(batchCreator => async () => {
const batch = await batchCreator()
const { imgs, groundTruthBoxes, filenames, dataIdx } = batch
const img = imgs[0]
const { reshapedImgDims, paddings } = getPaddingsAndReshapedSize(img, inputSize)
const squareImg = imageToSquare(img)
const batchInput = await faceapi.toNetInput(squareImg)
const [imgHeight, imgWidth] = batchInput.inputs[batchIdx].shape
// skip groundTruthBoxes, which are too tiny
const scaleFactor = inputSize / Math.max(imgHeight, imgWidth)
const filteredGroundTruthBoxes = groundTruthBoxes[batchIdx].filter(({ x, y, width, height }) => {
const box = (new faceapi.Rect(x, y, width, height))
.toBoundingBox()
.rescale({ height: imgHeight, width: imgWidth })
.rescale(scaleFactor)
const isTooTiny = box.width < 40 || box.height < 40
if (isTooTiny && window.debug) {
log(`skipping box for input size ${inputSize}: (${Math.floor(box.width)} x ${Math.floor(box.height)})`)
}
return !isTooTiny
})
if (!filteredGroundTruthBoxes.length) {
if (window.debug) {
log(`no boxes for input size ${inputSize}, ${groundTruthBoxes[batchIdx].length} boxes were too small`)
}
batchInput.dispose()
onBatchProcessed(dataIdx, inputSize)
return
}
let ts = Date.now()
const loss = minimize(filteredGroundTruthBoxes, batchInput, inputSize, batch, { reshapedImgDims, paddings })
ts = Date.now() - ts
if (window.logTrainSteps) {
log(`trainStep time for dataIdx ${dataIdx} (${inputSize}): ${ts} ms`)
}
loss.dispose()
batchInput.dispose()
onBatchProcessed(dataIdx, inputSize)
await tf.nextFrame()
}))
}))
await step(currentBatchCreators.next(rescaleEveryNthBatch))
}
await step(batchCreators.next(rescaleEveryNthBatch))
}
async function fetchGroundTruthBoxesForFile(file) {
const boxes = await fetch(file).then(res => res.json())
return {
file,
boxes
}
}
function createBatchCreators(detectionFilenames, batchSize) {
if (batchSize < 1) {
throw new Error('invalid batch size: ' + batchSize)
}
const batches = []
const pushToBatch = (remaining) => {
if (remaining.length) {
batches.push(remaining.slice(0, batchSize))
pushToBatch(remaining.slice(batchSize))
}
return batches
}
pushToBatch(detectionFilenames)
const batchCreators = batches.map((filenamesForBatch, dataIdx) => async () => {
const groundTruthBoxes = (await Promise.all(filenamesForBatch.map(fetchGroundTruthBoxesForFile)))
.map(({ boxes }) => boxes)
const imgs = await Promise.all(filenamesForBatch.map(
async file => await faceapi.bufferToImage(await fetchImage(file.replace('.json', '')))
))
return {
imgs,
groundTruthBoxes,
filenames: filenamesForBatch,
dataIdx
}
})
let idx = 0
function next(n) {
const nextBatchCreators = batchCreators.slice(idx, idx + n)
idx += n
return {
batchCreators: nextBatchCreators,
next
}
}
return {
data: batchCreators,
next
}
}
\ No newline at end of file
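createBatchCreators chunks the detection filenames into batches and exposes a cursor-style next(n), which trainStep uses to pull the next group of batch creators per input-size cycle. A minimal usage sketch (the filenames are made up):

```js
// Minimal usage sketch (made-up filenames): advance the cursor in groups
// of 2, the way trainStep does with rescaleEveryNthBatch.
const creators = createBatchCreators(['a.json', 'b.json', 'c.json'], 1)
let group = creators.next(2)          // first 2 batch creators
while (group.batchCreators.length) {
  // each creator resolves to { imgs, groundTruthBoxes, filenames, dataIdx }
  group = group.next(2)               // next group
}
```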
<!DOCTYPE html>
<html>
<head>
<script src="face-api.js"></script>
<script src="commons.js"></script>
<link rel="stylesheet" href="styles.css">
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/materialize/0.100.2/css/materialize.css">
<script type="text/javascript" src="https://code.jquery.com/jquery-2.1.1.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/materialize/0.100.2/js/materialize.min.js"></script>
</head>
<body>
<div id="navbar"></div>
<div class="center-content page-container">
<div class="progress" id="loader">
<div class="indeterminate"></div>
</div>
<div style="position: relative" class="margin">
<img id="inputImg" src="" style="max-width: 800px;" />
<canvas id="overlay" />
</div>
<div class="row side-by-side">
<div id="selectList"></div>
<div class="row">
<label for="imgUrlInput">Get image from URL:</label>
<input id="imgUrlInput" type="text" class="bold">
</div>
<button
class="waves-effect waves-light btn"
onclick="loadImageFromUrl()"
>
Ok
</button>
</div>
<div class="row side-by-side">
<div class="row input-field" style="margin-right: 20px;">
<select id="sizeType">
<option value="128">128 x 128</option>
<option value="160">160 x 160</option>
<option value="224">224 x 224</option>
<option value="320">320 x 320</option>
<option value="416">416 x 416</option>
<option value="608">608 x 608</option>
</select>
<label>Input Size</label>
</div>
<div class="row">
<label for="scoreThreshold">Score Threshold:</label>
<input disabled value="0.5" id="scoreThreshold" type="text" class="bold">
</div>
<button
class="waves-effect waves-light btn"
onclick="onDecreaseThreshold()"
>
<i class="material-icons left">-</i>
</button>
<button
class="waves-effect waves-light btn"
onclick="onIncreaseThreshold()"
>
<i class="material-icons left">+</i>
</button>
</div>
<div class="row">
<label for="imgByNr">Enter image NR: </label>
<input id="imgByNr" type="text" class="bold">
</div>
</div>
<script>
let scoreThreshold = 0.5
let sizeType = 608
function onKeyDown(e) {
e.target.value = (
parseInt(e.target.value) + (e.keyCode === 38 ? 1 : (e.keyCode === 40 ? -1 : 0))
) || e.target.value || 0
const imgUri = window.imgs[e.target.value]
console.log(imgUri)
onSelectionChanged(imgUri)
}
function onIncreaseThreshold() {
scoreThreshold = Math.min(faceapi.round(scoreThreshold + 0.1), 1.0)
$('#scoreThreshold').val(scoreThreshold)
updateResults()
}
function onDecreaseThreshold() {
scoreThreshold = Math.max(faceapi.round(scoreThreshold - 0.1), 0.1)
$('#scoreThreshold').val(scoreThreshold)
updateResults()
}
function onSizeTypeChanged(e, c) {
sizeType = e.target.value
$('#sizeType').val(sizeType)
updateResults()
}
async function loadImageFromUrl(url) {
const img = await requestExternalImage($('#imgUrlInput').val())
$('#inputImg').get(0).src = img.src
updateResults()
}
async function updateResults() {
const inputImgEl = $('#inputImg').get(0)
const { width, height } = inputImgEl
const canvas = $('#overlay').get(0)
canvas.width = width
canvas.height = height
const forwardParams = {
inputSize: parseInt(sizeType),
scoreThreshold
}
const detections = await window.net.locateFaces(inputImgEl, forwardParams)
faceapi.drawDetection('overlay', detections.map(det => det.forSize(width, height)))
}
async function onSelectionChanged(uri) {
const imgBuf = await fetchImage(uri)
$(`#inputImg`).get(0).src = (await faceapi.bufferToImage(imgBuf)).src
updateResults()
}
async function loadNetWeights(uri) {
return new Float32Array(await (await fetch(uri)).arrayBuffer())
}
async function fetchDetectionFilenames() {
return fetch('/detection_filenames').then(res => res.json())
}
async function run() {
$('#imgByNr').keydown(onKeyDown)
const startIdx224 = 35060
const startIdx320 = 41188
const startIdx416 = 31050
const startIdx608 = 16520
//const weightsUrl = `/tmp/tmp__224_${startIdx224}__320_${startIdx320}__416_${startIdx416}__608_${startIdx608}.weights`
const weightsUrl = `/tmp/tmp_multiscale_count_4200.weights`
const weights = await loadNetWeights(weightsUrl)
window.net = new faceapi.TinyYolov2(true)
await window.net.load(weights)
window.imgs = (await fetchDetectionFilenames()).map(f => f.replace('.json', ''))
$('#loader').hide()
onSelectionChanged($('#selectList select').val())
}
$(document).ready(function() {
renderNavBar('#navbar', 'tiny_yolov2_face_detection')
renderImageSelectList(
'#selectList',
async (uri) => {
await onSelectionChanged(uri)
},
'bbt1.jpg'
)
const sizeTypeSelect = $('#sizeType')
sizeTypeSelect.val(sizeType)
sizeTypeSelect.on('change', onSizeTypeChanged)
sizeTypeSelect.material_select()
run()
})
</script>
</body>
</html>
\ No newline at end of file
{
"extends": "../../tsconfig.json",
"include": [
"tinyYolov2"
]
}
[{"weights":[{"name":"conv0/conv/filters","shape":[3,3,3,16],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.0036540280370151294,"min":-0.44213739247883066}},{"name":"conv0/conv/bias","shape":[16],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.002164303555208094,"min":-0.13418682042290184}},{"name":"conv0/bn/sub","shape":[16],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.0028029036288167914,"min":-0.3139252064274806}},{"name":"conv0/bn/truediv","shape":[16],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.05010388411727606,"min":2.7867696285247803}},{"name":"conv1/conv/filters","shape":[3,3,16,32],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.0017398431020624497,"min":-0.22617960326811845}},{"name":"conv1/conv/bias","shape":[32],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.001145745112615473,"min":-0.12488621727508656}},{"name":"conv1/bn/sub","shape":[32],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.00586619587505565,"min":-0.9503237317590153}},{"name":"conv1/bn/truediv","shape":[32],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.007056442896525065,"min":1.2661800384521484}},{"name":"conv2/conv/filters","shape":[3,3,32,64],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.0010545693776186774,"min":-0.12971203344709734}},{"name":"conv2/conv/bias","shape":[64],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.0005329197820495157,"min":-0.07034541123053607}},{"name":"conv2/bn/sub","shape":[64],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.005863590567719703,"min":-0.691903686990925}},{"name":"conv2/bn/truediv","shape":[64],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.006374212339812634,"min":1.4001796245574951}},{"name":"conv3/conv/filters","shape":[3,3,64,128],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.0006935241175632851,"min":-0.08114232175490436}},{"name":"conv3/conv/bias","shape":[128],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.0002854522025468303,"min":-0.04224692597693088}},{"name":"conv3/bn/sub","shape":[128],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.008516784275279325,"min":-1.1667994457132675}},{"name":"conv3/bn/truediv","shape":[128],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.004096750652088838,"min":1.1394747495651245}},{"name":"conv4/conv/filters","shape":[3,3,128,256],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.0006537227946169236,"min":-0.06733344784554313}},{"name":"conv4/conv/bias","shape":[256],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.00018004616905076831,"min":-0.041770711219778246}},{"name":"conv4/bn/sub","shape":[256],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.011157989969440535,"min":-1.5509606057522343}},{"name":"conv4/bn/truediv","shape":[256],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.00421752836190018,"min":1.1967103481292725}},{"name":"conv5/conv/filters","shape":[3,3,256,512],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.0008280676077393925,"min":-0.05299632689532112}},{"name":"conv5/conv/bias","shape":[512],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.00007708504312105623,"min":-0.020973851904273033}},{"name":"conv5/bn/sub","shape":[512],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.011612189049814262,"min":-1.7302161684223252}},{"name":"conv5/bn/truediv","shape":[512],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.003486
9993434232826,"min":1.2963157892227173}},{"name":"conv6/conv/filters","shape":[3,3,512,1024],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.0003250038945207409,"min":-0.04030048292057187}},{"name":"conv6/conv/bias","shape":[1024],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.00009164303220114009,"min":-0.02338002622127533}},{"name":"conv6/bn/sub","shape":[1024],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.0168181017333386,"min":-2.068626513200648}},{"name":"conv6/bn/truediv","shape":[1024],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.004455030198190726,"min":1.309550166130066}},{"name":"conv7/conv/filters","shape":[3,3,1024,1024],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.00026194834533859704,"min":-0.033267439858001825}},{"name":"conv7/conv/bias","shape":[1024],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.0001491123554753322,"min":-0.011928988438026577}},{"name":"conv7/bn/sub","shape":[1024],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.009263983427309523,"min":-1.1765258952683093}},{"name":"conv7/bn/truediv","shape":[1024],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.01123507022857666,"min":0.9360886812210083}},{"name":"conv8/filters","shape":[1,1,1024,30],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.001922343233052422,"min":-0.2614386796951294}},{"name":"conv8/bias","shape":[30],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.0009638834233377494,"min":-0.15904076485072866}}],"paths":["tiny_yolov2_model-shard1","tiny_yolov2_model-shard2","tiny_yolov2_model-shard3","tiny_yolov2_model-shard4"]}]
\ No newline at end of file
[{"weights":[{"name":"conv0/depthwise_filter","shape":[3,3,3,1],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.004699238725737029,"min":-0.7471789573921876}},{"name":"conv0/pointwise_filter","shape":[1,1,3,16],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.008118405529097015,"min":-1.071629529840806}},{"name":"conv0/bias","shape":[16],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.0024678509609372006,"min":-0.28873856242965246}},{"name":"conv1/depthwise_filter","shape":[3,3,16,1],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.004553892331964829,"min":-0.5737904338275684}},{"name":"conv1/pointwise_filter","shape":[1,1,16,32],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.00980057996862075,"min":-1.3230782957638012}},{"name":"conv1/bias","shape":[32],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.0011220066278588537,"min":-0.20644921952602907}},{"name":"conv2/depthwise_filter","shape":[3,3,32,1],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.0032098570290733787,"min":-0.38839270051787883}},{"name":"conv2/pointwise_filter","shape":[1,1,32,64],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.008682825051101984,"min":-1.154815731796564}},{"name":"conv2/bias","shape":[64],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.0015120926440930834,"min":-0.21471715546121783}},{"name":"conv3/depthwise_filter","shape":[3,3,64,1],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.003597520496331009,"min":-0.4317024595597211}},{"name":"conv3/pointwise_filter","shape":[1,1,64,128],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.010341314240997913,"min":-1.3650534798117246}},{"name":"conv3/bias","shape":[128],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.002109630785736383,"min":-0.4113780032185947}},{"name":"conv4/depthwise_filter","shape":[3,3,128,1],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.004783747476689955,"min":-0.6171034244930043}},{"name":"conv4/pointwise_filter","shape":[1,1,128,256],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.009566552498761345,"min":-1.2627849298364977}},{"name":"conv4/bias","shape":[256],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.0020002245903015135,"min":-0.3860433459281921}},{"name":"conv5/depthwise_filter","shape":[3,3,256,1],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.004355777244941861,"min":-0.4791354969436047}},{"name":"conv5/pointwise_filter","shape":[1,1,256,512],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.010036561068366555,"min":-1.2545701335458193}},{"name":"conv5/bias","shape":[512],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.0023248311935686597,"min":-0.42776893961663337}},{"name":"conv6/depthwise_filter","shape":[3,3,512,1],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.004659063442080629,"min":-0.5963601205863205}},{"name":"conv6/pointwise_filter","shape":[1,1,512,1024],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.010061494509379069,"min":-1.2576868136723836}},{"name":"conv6/bias","shape":[1024],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.0029680932269376867,"min":-0.3947563991827123}},{"name":"conv7/depthwise_filter","shape":[3,3,1024,1],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.003887363508635876,"min":-0.48980780208812036}},{"name":"conv7/pointwise_filter","shape":[1,1,1024,1024],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.009973861189449535,"
min":-1.2766542322495404}},{"name":"conv7/bias","shape":[1024],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.004667898486642276,"min":-0.6955168745096991}},{"name":"conv8/filters","shape":[1,1,1024,25],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.01573958116419175,"min":-2.5340725674348716}},{"name":"conv8/bias","shape":[25],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.009396760662396749,"min":-2.2552225589752197}}],"paths":["tiny_yolov2_separable_conv_model-shard1"]}]
\ No newline at end of file
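Both manifests above describe uint8-quantized tensors, each with a per-tensor scale and min. Assuming the usual mapping back to float32 is value = q * scale + min (the inverse of the quantizeWeights step in the quantize view), a minimal dequantization sketch:

```js
// Minimal sketch (assumption: float value = q * scale + min), where `entry`
// is one weight entry from the manifest and `bytes` is its slice of the shard.
function dequantizeEntry(entry, bytes) {
  const { scale, min } = entry.quantization
  const floats = new Float32Array(bytes.length)
  bytes.forEach((q, i) => { floats[i] = q * scale + min })
  return floats
}
```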