Unverified Commit 5fefdcd5 by justadudewhohacks Committed by GitHub

Merge pull request #72 from justadudewhohacks/tiny-yolov2-seperable-conv2d

Tiny yolov2 seperable conv2d
parents 661f228b 4b4ecdb0
......@@ -2,4 +2,6 @@ node_modules
.rpt2_cache
.env*
tmp
proto
weights_uncompressed
weights_unused
\ No newline at end of file
......@@ -5,5 +5,6 @@ examples
proto
weights
weights_uncompressed
weights_unused
test
tools
\ No newline at end of file
......@@ -27,7 +27,7 @@ Table of Contents:
* **[Face Detection & 5 Point Face Landmarks - MTCNN](#usage-face-detection-mtcnn)**
* **[Face Recognition](#usage-face-recognition)**
* **[68 Point Face Landmark Detection](#usage-face-landmark-detection)**
* **[Full Face Detection and Recognition Pipeline](#usage-full-face-detection-and-recognition-pipeline)**
* **[Shortcut Functions for Full Face Description](#shortcut-functions)**
## Examples
......@@ -89,15 +89,15 @@ The face detection model has been trained on the [WIDERFACE dataset](http://mmla
### Face Detection - Tiny Yolo v2
The Tiny Yolo v2 based face detector can easily adapt to different input image sizes and can thus be used as an alternative to SSD Mobilenet v1 to trade off accuracy for performance (inference time). In general, the model is not as accurate as SSD Mobilenet v1, but it can achieve faster inference at lower image sizes.
The Tiny Yolo v2 implementation is a very performant face detector, which can easily adapt to different input image sizes and can thus be used as an alternative to SSD Mobilenet v1 to trade off accuracy for performance (inference time). In general, the model's ability to locate smaller face bounding boxes is not as accurate as that of SSD Mobilenet v1.
The Tiny Yolo v2 implementation is still experimental, meaning there is room for optimization (future work). The trained model weights are provided in the [azFace](https://github.com/azmathmoosa/azFace) project.
The face detector has been trained on a custom dataset of ~10K images labeled with bounding boxes and uses depthwise separable convolutions instead of regular convolutions, which ensures very fast inference and allows for a quantized model size of only 1.7MB, making the model extremely mobile and web friendly. Thus, the Tiny Yolo v2 face detector should be your GO-TO face detector on mobile devices.
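For reference, a minimal usage sketch of the Tiny Yolo v2 face detector (the model path and the parameter values are assumptions, see the usage sections below for details):
``` javascript
// load the quantized tiny yolov2 (separable conv) model, e.g. from /models
await faceapi.loadTinyYolov2Model('/models')

// detect faces, inputSize can be 'xs' | 'sm' | 'md' | 'lg' or a custom number
const detections = await faceapi.tinyYolov2(input, { inputSize: 'md', scoreThreshold: 0.5 })
```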
<a name="about-face-detection-mtcnn"></a>
### Face Detection & 5 Point Face Landmarks - MTCNN
MTCNN (Multi-task Cascaded Convolutional Neural Networks) represents an alternative face detector to SSD Mobilenet v1 and Tiny Yolo v2, which offers much more room for configuration and is able to achieve much lower processing times. MTCNN is a 3 stage cascaded CNN, which simultaneously returns 5 face landmark points along with the bounding boxes and scores for each face. By limiting the minimum size of faces expected in an image, MTCNN allows you to process frames from your webcam in realtime. Additionally, at 2MB, the weights file is only a third of the size of the quantized SSD Mobilenet v1 model (~6MB).
MTCNN (Multi-task Cascaded Convolutional Neural Networks) represents an alternative face detector to SSD Mobilenet v1 and Tiny Yolo v2, which offers much more room for configuration. By tuning the input parameters, MTCNN is able to detect a wide range of face bounding box sizes. MTCNN is a 3 stage cascaded CNN, which simultaneously returns 5 face landmark points along with the bounding boxes and scores for each face. By limiting the minimum size of faces expected in an image, MTCNN allows you to process frames from your webcam in realtime. Additionally, the model size is only 2MB.
MTCNN has been presented in the paper [Joint Face Detection and Alignment using Multi-task Cascaded Convolutional Networks](https://kpzhang93.github.io/MTCNN_face_detection_alignment/paper/spl.pdf) by Zhang et al. and the model weights are provided in the official [repo](https://github.com/kpzhang93/MTCNN_face_detection_alignment) of the MTCNN implementation.
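For reference, a minimal sketch using the MTCNN based shortcut function described later in this readme (it assumes the MTCNN and face recognition models have been loaded, and the minFaceSize value is just an example):
``` javascript
await faceapi.loadMtcnnModel('/models')
await faceapi.loadFaceRecognitionModel('/models')

// a larger minFaceSize skips smaller faces, but speeds up processing
const fullFaceDescriptions = await faceapi.allFacesMtcnn(input, { minFaceSize: 20 })
```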
......@@ -164,7 +164,7 @@ await net.load('/models/face_detection_model-weights_manifest.json')
// await net.load('/models/face_landmark_68_model-weights_manifest.json')
// await net.load('/models/face_recognition_model-weights_manifest.json')
// await net.load('/models/mtcnn_model-weights_manifest.json')
// await net.load('/models/tiny_yolov2_model-weights_manifest.json')
// await net.load('/models/tiny_yolov2_separable_conv_model-weights_manifest.json')
// or simply load all models
await net.load('/models')
......@@ -197,7 +197,7 @@ const maxResults = 10
// inputs can be html canvas, img or video element or their ids ...
const myImg = document.getElementById('myImg')
const detections = await faceapi.locateFaces(myImg, minConfidence, maxResults)
const detections = await faceapi.ssdMobilenetv1(myImg, minConfidence, maxResults)
```
Draw the detected faces to a canvas:
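The drawing code itself is not part of this diff; a minimal sketch based on the drawDetection call used in the example pages (the canvas id and image element are assumptions):
``` javascript
const { width, height } = myImg
const canvas = document.getElementById('overlay')
canvas.width = width
canvas.height = height

// draw the detections, rescaled to the displayed image size
faceapi.drawDetection('overlay', detections.map(det => det.forSize(width, height)))
```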
......@@ -356,7 +356,7 @@ const rightEyeBrow = landmarks.getRightEyeBrow()
Compute the Face Landmarks for Detected Faces:
``` javascript
const detections = await faceapi.locateFaces(input)
const detections = await faceapi.ssdMobilenetv1(input)
// get the face tensors from the image (have to be disposed manually)
const faceTensors = await faceapi.extractFaceTensors(input, detections)
......@@ -366,50 +366,35 @@ const landmarksByFace = await Promise.all(faceTensors.map(t => faceapi.detectLan
faceTensors.forEach(t => t.dispose())
```
<a name="usage-full-face-detection-and-recognition-pipeline"></a>
<a name="shortcut-functions"></a>
### Full Face Detection and Recognition Pipeline
### Shortcut Functions for Full Face Description
After face detection has been performed, I would recommend aligning the bounding boxes of the detected faces before passing them to the face recognition net, which will make the computed face descriptor much more accurate. Fortunately, the api can do this for you under the hood. You can obtain the full face descriptions (location, landmarks and descriptor) of each face in an input image as follows:
After face detection has been performed, I would recommend aligning the bounding boxes of the detected faces before passing them to the face recognition net, which will make the computed face descriptor much more accurate. Fortunately, the api can do this for you under the hood by providing convenient shortcut functions. You can obtain the full face descriptions (location, landmarks and descriptor) of each face in an input image as follows:
``` javascript
const fullFaceDescriptions = await faceapi.allFaces(input, minConfidence)
const fullFaceDescription0 = fullFaceDescriptions[0]
console.log(fullFaceDescription0.detection) // bounding box & score
console.log(fullFaceDescription0.landmarks) // 68 point face landmarks
console.log(fullFaceDescription0.descriptor) // face descriptor
Using the SSD Mobilenet v1 face detector + 68 point face landmark detector:
``` javascript
const fullFaceDescriptions = await faceapi.allFacesSsdMobilenetv1(input, minConfidence)
```
You can also do everything manually as shown in the following:
Using the Tiny Yolo v2 face detector + 68 point face landmark detector:
``` javascript
// first detect the face locations
const detections = await faceapi.locateFaces(input, minConfidence)
// get the face tensors from the image (have to be disposed manually)
const faceTensors = (await faceapi.extractFaceTensors(input, detections))
// detect landmarks and get the aligned face image bounding boxes
const alignedFaceBoxes = await Promise.all(faceTensors.map(
async (faceTensor, i) => {
const faceLandmarks = await faceapi.detectLandmarks(faceTensor)
return faceLandmarks.align(detections[i])
}
))
const fullFaceDescriptions = await faceapi.allFacesTinyYolov2(input, { inputSize: 'md' })
```
// free memory for face image tensors after we detected the face landmarks
faceTensors.forEach(t => t.dispose())
Or with MTCNN face detection + 5 point face landmarks:
// get the face tensors for the aligned face images from the image (have to be disposed manually)
const alignedFaceTensors = (await faceapi.extractFaceTensors(input, alignedFaceBoxes))
``` javascript
const fullFaceDescriptions = await faceapi.allFacesMtcnn(input, { minFaceSize: 20 })
```
// compute the face descriptors from the aligned face images
const descriptors = await Promise.all(alignedFaceTensors.map(
faceTensor => faceapi.computeFaceDescriptor(faceTensor)
))
The shortcut functions return an array of FullFaceDescriptions:
// free memory for face image tensors after we computed their descriptors
alignedFaceTensors.forEach(t => t.dispose())
``` javascript
const fullFaceDescription0 = fullFaceDescriptions[0]
console.log(fullFaceDescription0.detection) // bounding box & score
console.log(fullFaceDescription0.landmarks) // face landmarks
console.log(fullFaceDescription0.descriptor) // face descriptor
```
\ No newline at end of file
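To decide whether two of the returned descriptors belong to the same person, you can for example compute their euclidean distance; the 0.6 threshold below is the default used in the example pages and is only a rough guideline:
``` javascript
function euclideanDistance(d1, d2) {
  let sum = 0
  for (let i = 0; i < d1.length; i++) {
    const diff = d1[i] - d2[i]
    sum += diff * diff
  }
  return Math.sqrt(sum)
}

const distance = euclideanDistance(fullFaceDescriptions[0].descriptor, fullFaceDescriptions[1].descriptor)
const isSamePerson = distance < 0.6
```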
......@@ -147,6 +147,10 @@ function renderNavBar(navbarId, exampleUri) {
name: 'Tiny Yolov2 Face Detection Webcam'
},
{
uri: 'tiny_yolov2_face_recognition',
name: 'Tiny Yolov2 Face Recognition'
},
{
uri: 'batch_face_landmarks',
name: 'Batch Face Landmarks'
},
......
......@@ -32,6 +32,7 @@ app.get('/mtcnn_face_recognition_webcam', (req, res) => res.sendFile(path.join(v
app.get('/tiny_yolov2_face_detection', (req, res) => res.sendFile(path.join(viewsDir, 'tinyYolov2FaceDetection.html')))
app.get('/tiny_yolov2_face_detection_video', (req, res) => res.sendFile(path.join(viewsDir, 'tinyYolov2FaceDetectionVideo.html')))
app.get('/tiny_yolov2_face_detection_webcam', (req, res) => res.sendFile(path.join(viewsDir, 'tinyYolov2FaceDetectionWebcam.html')))
app.get('/tiny_yolov2_face_recognition', (req, res) => res.sendFile(path.join(viewsDir, 'tinyYolov2FaceRecognition.html')))
app.get('/batch_face_landmarks', (req, res) => res.sendFile(path.join(viewsDir, 'batchFaceLandmarks.html')))
app.get('/batch_face_recognition', (req, res) => res.sendFile(path.join(viewsDir, 'batchFaceRecognition.html')))
......
......@@ -53,6 +53,15 @@
let modelLoaded = false
let result
let forwardTimes = []
function updateTimeStats(timeInMs) {
forwardTimes = [timeInMs].concat(forwardTimes).slice(0, 30)
const avgTimeInMs = forwardTimes.reduce((total, t) => total + t) / forwardTimes.length
$('#time').val(`${Math.round(avgTimeInMs)} ms`)
$('#fps').val(`${faceapi.round(1000 / avgTimeInMs)}`)
}
function onIncreaseThreshold() {
minConfidence = Math.min(faceapi.round(minConfidence + 0.1), 1.0)
$('#minConfidence').val(minConfidence)
......@@ -63,11 +72,6 @@
$('#minConfidence').val(minConfidence)
}
function displayTimeStats(timeInMs) {
$('#time').val(`${timeInMs} ms`)
$('#fps').val(`${faceapi.round(1000 / timeInMs)}`)
}
async function onPlay(videoEl) {
if(videoEl.paused || videoEl.ended || !modelLoaded)
return false
......@@ -79,7 +83,7 @@
const ts = Date.now()
result = await faceapi.locateFaces(videoEl, minConfidence)
displayTimeStats(Date.now() - ts)
updateTimeStats(Date.now() - ts)
faceapi.drawDetection('overlay', result.map(det => det.forSize(width, height)))
setTimeout(() => onPlay(videoEl))
......
......@@ -63,6 +63,15 @@
let sizeType = 'md'
let modelLoaded = false
let forwardTimes = []
function updateTimeStats(timeInMs) {
forwardTimes = [timeInMs].concat(forwardTimes).slice(0, 30)
const avgTimeInMs = forwardTimes.reduce((total, t) => total + t) / forwardTimes.length
$('#time').val(`${Math.round(avgTimeInMs)} ms`)
$('#fps').val(`${faceapi.round(1000 / avgTimeInMs)}`)
}
function onIncreaseThreshold() {
scoreThreshold = Math.min(faceapi.round(scoreThreshold + 0.1), 1.0)
$('#scoreThreshold').val(scoreThreshold)
......@@ -78,11 +87,6 @@
$('#sizeType').val(sizeType)
}
function displayTimeStats(timeInMs) {
$('#time').val(`${timeInMs} ms`)
$('#fps').val(`${faceapi.round(1000 / timeInMs)}`)
}
async function onPlay(videoEl) {
if(videoEl.paused || videoEl.ended || !modelLoaded)
return false
......@@ -99,7 +103,7 @@
const ts = Date.now()
result = await faceapi.tinyYolov2(videoEl, forwardParams)
displayTimeStats(Date.now() - ts)
updateTimeStats(Date.now() - ts)
faceapi.drawDetection('overlay', result.map(det => det.forSize(width, height)))
setTimeout(() => onPlay(videoEl))
......
......@@ -64,6 +64,15 @@
let sizeType = '160'
let modelLoaded = false
let forwardTimes = []
function updateTimeStats(timeInMs) {
forwardTimes = [timeInMs].concat(forwardTimes).slice(0, 30)
const avgTimeInMs = forwardTimes.reduce((total, t) => total + t) / forwardTimes.length
$('#time').val(`${Math.round(avgTimeInMs)} ms`)
$('#fps').val(`${faceapi.round(1000 / avgTimeInMs)}`)
}
function onIncreaseThreshold() {
scoreThreshold = Math.min(faceapi.round(scoreThreshold + 0.1), 1.0)
$('#scoreThreshold').val(scoreThreshold)
......@@ -79,11 +88,6 @@
$('#sizeType').val(sizeType)
}
function displayTimeStats(timeInMs) {
$('#time').val(`${timeInMs} ms`)
$('#fps').val(`${faceapi.round(1000 / timeInMs)}`)
}
async function onPlay(videoEl) {
if(videoEl.paused || videoEl.ended || !modelLoaded)
return false
......@@ -100,7 +104,7 @@
const ts = Date.now()
result = await faceapi.tinyYolov2(videoEl, forwardParams)
displayTimeStats(Date.now() - ts)
updateTimeStats(Date.now() - ts)
faceapi.drawDetection('overlay', result.map(det => det.forSize(width, height)))
setTimeout(() => onPlay(videoEl))
......@@ -124,7 +128,7 @@
}
$(document).ready(function() {
renderNavBar('#navbar', 'tiny_yolov2_face_detection_video')
renderNavBar('#navbar', 'tiny_yolov2_face_detection_webcam')
const sizeTypeSelect = $('#sizeType')
sizeTypeSelect.val(sizeType)
......
<!DOCTYPE html>
<html>
<head>
<script src="face-api.js"></script>
<script src="commons.js"></script>
<link rel="stylesheet" href="styles.css">
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/materialize/0.100.2/css/materialize.css">
<script type="text/javascript" src="https://code.jquery.com/jquery-2.1.1.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/materialize/0.100.2/js/materialize.min.js"></script>
</head>
<body>
<div id="navbar"></div>
<div class="center-content page-container">
<div class="progress" id="loader">
<div class="indeterminate"></div>
</div>
<div style="position: relative" class="margin">
<img id="inputImg" src="" style="max-width: 800px;" />
<canvas id="overlay" />
</div>
<div class="row side-by-side">
<div id="selectList"></div>
<div class="row">
<label for="imgUrlInput">Get image from URL:</label>
<input id="imgUrlInput" type="text" class="bold">
</div>
<button
class="waves-effect waves-light btn"
onclick="loadImageFromUrl()"
>
Ok
</button>
<p>
<input type="checkbox" id="useBatchProcessing" onchange="onChangeUseBatchProcessing(event)" />
<label for="useBatchProcessing">Use Batch Processing</label>
</p>
</div>
<div class="row side-by-side">
<div class="row input-field" style="margin-right: 20px;">
<select id="sizeType">
<option value="" disabled selected>Input Size:</option>
<option value="xs">XS: 224 x 224</option>
<option value="sm">SM: 320 x 320</option>
<option value="md">MD: 416 x 416</option>
<option value="lg">LG: 608 x 608</option>
</select>
<label>Input Size</label>
</div>
<div class="row">
<label for="scoreThreshold">Score Threshold:</label>
<input disabled value="0.5" id="scoreThreshold" type="text" class="bold">
</div>
<button
class="waves-effect waves-light btn"
onclick="onDecreaseThreshold()"
>
<i class="material-icons left">-</i>
</button>
<button
class="waves-effect waves-light btn"
onclick="onIncreaseThreshold()"
>
<i class="material-icons left">+</i>
</button>
</div>
<div class="row side-by-side">
<div class="row">
<label for="maxDistance">Max Descriptor Distance:</label>
<input disabled value="0.6" id="maxDistance" type="text" class="bold">
</div>
<button
class="waves-effect waves-light btn button-sm"
onclick="onDecreaseMaxDistance()"
>
<i class="material-icons left">-</i>
</button>
<button
class="waves-effect waves-light btn button-sm"
onclick="onIncreaseMaxDistance()"
>
<i class="material-icons left">+</i>
</button>
</div>
</div>
<script>
let maxDistance = 0.6
let useBatchProcessing = false
let trainDescriptorsByClass = []
let scoreThreshold = 0.5
let sizeType = 'lg'
function onIncreaseThreshold() {
scoreThreshold = Math.min(faceapi.round(scoreThreshold + 0.1), 1.0)
$('#scoreThreshold').val(scoreThreshold)
updateResults()
}
function onDecreaseThreshold() {
scoreThreshold = Math.max(faceapi.round(scoreThreshold - 0.1), 0.1)
$('#scoreThreshold').val(scoreThreshold)
updateResults()
}
function onSizeTypeChanged(e, c) {
sizeType = e.target.value
$('#sizeType').val(sizeType)
updateResults()
}
function onChangeUseBatchProcessing(e) {
useBatchProcessing = $(e.target).prop('checked')
}
function onIncreaseMaxDistance() {
maxDistance = Math.min(faceapi.round(maxDistance + 0.1), 1.0)
$('#maxDistance').val(maxDistance)
updateResults()
}
function onDecreaseMaxDistance() {
maxDistance = Math.max(faceapi.round(maxDistance - 0.1), 0.1)
$('#maxDistance').val(maxDistance)
updateResults()
}
async function loadImageFromUrl(url) {
const img = await requestExternalImage($('#imgUrlInput').val())
$('#inputImg').get(0).src = img.src
updateResults()
}
async function updateResults() {
const inputImgEl = $('#inputImg').get(0)
const { width, height } = inputImgEl
const canvas = $('#overlay').get(0)
canvas.width = width
canvas.height = height
const forwardParams = {
inputSize: sizeType,
scoreThreshold
}
const fullFaceDescriptions = (await faceapi.allFacesTinyYolov2(inputImgEl, forwardParams, useBatchProcessing))
.map(fd => fd.forSize(width, height))
fullFaceDescriptions.forEach(({ detection, descriptor }) => {
faceapi.drawDetection('overlay', [detection], { withScore: false })
const bestMatch = getBestMatch(trainDescriptorsByClass, descriptor)
const text = `${bestMatch.distance < maxDistance ? bestMatch.className : 'unknown'} (${bestMatch.distance})`
const { x, y, height: boxHeight } = detection.getBox()
faceapi.drawText(
canvas.getContext('2d'),
x,
y + boxHeight,
text,
Object.assign(faceapi.getDefaultDrawOptions(), { color: 'red', fontSize: 16 })
)
})
}
async function onSelectionChanged(uri) {
const imgBuf = await fetchImage(uri)
$(`#inputImg`).get(0).src = (await faceapi.bufferToImage(imgBuf)).src
updateResults()
}
async function run() {
await faceapi.loadTinyYolov2Model('/')
await faceapi.loadFaceLandmarkModel('/')
await faceapi.loadFaceRecognitionModel('/')
trainDescriptorsByClass = await initTrainDescriptorsByClass(faceapi.recognitionNet, 1)
$('#loader').hide()
onSelectionChanged($('#selectList select').val())
}
$(document).ready(function() {
renderNavBar('#navbar', 'tiny_yolov2_face_recognition')
renderImageSelectList(
'#selectList',
async (uri) => {
await onSelectionChanged(uri)
},
'bbt1.jpg'
)
const sizeTypeSelect = $('#sizeType')
sizeTypeSelect.val(sizeType)
sizeTypeSelect.on('change', onSizeTypeChanged)
sizeTypeSelect.material_select()
run()
})
</script>
</body>
</html>
\ No newline at end of file
......@@ -4,7 +4,8 @@ const dataFiles = [
'test/data/*.json',
'test/media/*.mp4',
'weights/**/*',
'weights_uncompressed/**/*'
'weights_uncompressed/**/*',
'weights_unused/**/*'
].map(pattern => ({
pattern,
watched: false,
......
This source diff could not be displayed because it is too large. You can view the blob instead.
import { Rect } from './Rect';
import { Dimensions } from './types';
import { isDimensions } from './utils';
export class BoundingBox {
constructor(
......@@ -33,6 +35,10 @@ export class BoundingBox {
return this.bottom - this.top
}
public get area() : number {
return this.width * this.height
}
public toSquare(): BoundingBox {
let { left, top, right, bottom } = this
......@@ -100,6 +106,12 @@ export class BoundingBox {
).toSquare().round()
}
public rescale(s: Dimensions | number) {
const scaleX = isDimensions(s) ? (s as Dimensions).width : s as number
const scaleY = isDimensions(s) ? (s as Dimensions).height : s as number
return new BoundingBox(this.left * scaleX, this.top * scaleY, this.right * scaleX, this.bottom * scaleY)
}
public toRect(): Rect {
return new Rect(this.left, this.top, this.width, this.height)
}
......
......@@ -3,7 +3,7 @@ import * as tf from '@tensorflow/tfjs-core';
import { isTensor3D, isTensor4D } from './commons/isTensor';
import { padToSquare } from './padToSquare';
import { Point } from './Point';
import { TResolvedNetInput } from './types';
import { TResolvedNetInput, Dimensions } from './types';
import { createCanvasFromMedia } from './utils';
export class NetInput {
......@@ -14,6 +14,7 @@ export class NetInput {
private _inputDimensions: number[][] = []
private _paddings: Point[] = []
private _inputSize: number = 0
constructor(
inputs: tf.Tensor4D | Array<TResolvedNetInput>,
......@@ -81,6 +82,22 @@ export class NetInput {
return this._paddings
}
public get inputSize(): number {
return this._inputSize
}
public get relativePaddings(): Point[] {
return Array(this.inputs.length).fill(0).map(
(_, batchIdx) => this.getRelativePaddings(batchIdx)
)
}
public get reshapedInputDimensions(): Dimensions[] {
return Array(this.inputs.length).fill(0).map(
(_, batchIdx) => this.getReshapedInputDimensions(batchIdx)
)
}
public getInputDimensions(batchIdx: number): number[] {
return this._inputDimensions[batchIdx]
}
......@@ -97,8 +114,26 @@ export class NetInput {
return this._paddings[batchIdx]
}
public getRelativePaddings(batchIdx: number): Point {
return new Point(
(this.getPaddings(batchIdx).x + this.getInputWidth(batchIdx)) / this.getInputWidth(batchIdx),
(this.getPaddings(batchIdx).y + this.getInputHeight(batchIdx)) / this.getInputHeight(batchIdx)
)
}
public getReshapedInputDimensions(batchIdx: number): Dimensions {
const [h, w] = [this.getInputHeight(batchIdx), this.getInputWidth(batchIdx)]
const f = this.inputSize / Math.max(h, w)
return {
height: Math.floor(h * f),
width: Math.floor(w * f)
}
}
public toBatchTensor(inputSize: number, isCenterInputs: boolean = true): tf.Tensor4D {
this._inputSize = inputSize
return tf.tidy(() => {
const inputTensors = this._inputs.map((inputTensor: tf.Tensor3D) => {
......@@ -131,7 +166,7 @@ export class NetInput {
}
/**
* By setting the isManaged flag, all newly created tensors will be automatically
* By setting the isManaged flag, all newly created tensors will be
* automatically disposed after the batch tensor has been created
*/
public managed() {
......
import { BoundingBox } from './BoundingBox';
export interface IRect {
x: number
y: number
......@@ -54,6 +55,10 @@ export class Rect implements IRect {
)
}
public toBoundingBox(): BoundingBox {
return new BoundingBox(this.x, this.y, this.x + this.width, this.y + this.height)
}
public clipAtImageBorders(imgWidth: number, imgHeight: number): Rect {
const { x, y, right, bottom } = this
const clippedX = Math.max(x, 0)
......
import { TinyYolov2 } from '.';
import { extractFaceTensors } from './extractFaceTensors';
import { FaceDetection } from './FaceDetection';
import { FaceDetectionNet } from './faceDetectionNet/FaceDetectionNet';
import { FaceLandmarkNet } from './faceLandmarkNet/FaceLandmarkNet';
import { FaceLandmarks68 } from './faceLandmarkNet/FaceLandmarks68';
......@@ -7,6 +9,7 @@ import { FullFaceDescription } from './FullFaceDescription';
import { Mtcnn } from './mtcnn/Mtcnn';
import { MtcnnForwardParams } from './mtcnn/types';
import { Rect } from './Rect';
import { TinyYolov2ForwardParams } from './tinyYolov2/types';
import { TNetInput } from './types';
function computeDescriptorsFactory(
......@@ -27,8 +30,8 @@ function computeDescriptorsFactory(
}
}
export function allFacesFactory(
detectionNet: FaceDetectionNet,
function allFacesFactory(
detectFaces: (input: TNetInput) => Promise<FaceDetection[]>,
landmarkNet: FaceLandmarkNet,
recognitionNet: FaceRecognitionNet
) {
......@@ -36,11 +39,10 @@ export function allFacesFactory(
return async function(
input: TNetInput,
minConfidence: number = 0.8,
useBatchProcessing: boolean = false
): Promise<FullFaceDescription[]> {
const detections = await detectionNet.locateFaces(input, minConfidence)
const detections = await detectFaces(input)
const faceTensors = await extractFaceTensors(input, detections)
const faceLandmarksByFace = useBatchProcessing
......@@ -68,6 +70,38 @@ export function allFacesFactory(
}
}
export function allFacesSsdMobilenetv1Factory(
ssdMobilenetv1: FaceDetectionNet,
landmarkNet: FaceLandmarkNet,
recognitionNet: FaceRecognitionNet
) {
return async function(
input: TNetInput,
minConfidence: number = 0.8,
useBatchProcessing: boolean = false
): Promise<FullFaceDescription[]> {
const detectFaces = (input: TNetInput) => ssdMobilenetv1.locateFaces(input, minConfidence)
const allFaces = allFacesFactory(detectFaces, landmarkNet, recognitionNet)
return allFaces(input, useBatchProcessing)
}
}
export function allFacesTinyYolov2Factory(
tinyYolov2: TinyYolov2,
landmarkNet: FaceLandmarkNet,
recognitionNet: FaceRecognitionNet
) {
return async function(
input: TNetInput,
forwardParams: TinyYolov2ForwardParams = {},
useBatchProcessing: boolean = false
): Promise<FullFaceDescription[]> {
const detectFaces = (input: TNetInput) => tinyYolov2.locateFaces(input, forwardParams)
const allFaces = allFacesFactory(detectFaces, landmarkNet, recognitionNet)
return allFaces(input, useBatchProcessing)
}
}
export function allFacesMtcnnFactory(
mtcnn: Mtcnn,
recognitionNet: FaceRecognitionNet
......
import { BoundingBox } from '../BoundingBox';
import { iou } from '../iou';
export function nonMaxSuppression(
boxes: BoundingBox[],
......@@ -7,10 +8,6 @@ export function nonMaxSuppression(
isIOU: boolean = true
): number[] {
const areas = boxes.map(
box => (box.width + 1) * (box.height + 1)
)
let indicesSortedByScore = scores
.map((score, boxIndex) => ({ score, boxIndex }))
.sort((c1, c2) => c1.score - c2.score)
......@@ -31,15 +28,7 @@ export function nonMaxSuppression(
const currBox = boxes[curr]
const idxBox = boxes[idx]
const width = Math.max(0.0, Math.min(currBox.right, idxBox.right) - Math.max(currBox.left, idxBox.left) + 1)
const height = Math.max(0.0, Math.min(currBox.bottom, idxBox.bottom) - Math.max(currBox.top, idxBox.top) + 1)
const interSection = width * height
const out = isIOU
? interSection / (areas[curr] + areas[idx] - interSection)
: interSection / Math.min(areas[curr], areas[idx])
outputs.push(out)
outputs.push(iou(currBox, idxBox, isIOU))
}
indicesSortedByScore = indicesSortedByScore.filter(
......
import * as tf from '@tensorflow/tfjs-core';
export function normalize(x: tf.Tensor4D, meanRgb: number[]): tf.Tensor4D {
return tf.tidy(() => {
const [r, g, b] = meanRgb
const avg_r = tf.fill([...x.shape.slice(0, 3), 1], r)
const avg_g = tf.fill([...x.shape.slice(0, 3), 1], g)
const avg_b = tf.fill([...x.shape.slice(0, 3), 1], b)
const avg_rgb = tf.concat([avg_r, avg_g, avg_b], 3)
return tf.sub(x, avg_rgb)
})
}
\ No newline at end of file
......@@ -81,19 +81,18 @@ export class FaceDetectionNet extends NeuralNetwork<NetParams> {
minConfidence
)
const paddedHeightRelative = (netInput.getPaddings(0).y + netInput.getInputHeight(0)) / netInput.getInputHeight(0)
const paddedWidthRelative = (netInput.getPaddings(0).x + netInput.getInputWidth(0)) / netInput.getInputWidth(0)
const paddings = netInput.getRelativePaddings(0)
const results = indices
.map(idx => {
const [top, bottom] = [
Math.max(0, boxes.get(idx, 0)),
Math.min(1.0, boxes.get(idx, 2))
].map(val => val * paddedHeightRelative)
].map(val => val * paddings.y)
const [left, right] = [
Math.max(0, boxes.get(idx, 1)),
Math.min(1.0, boxes.get(idx, 3))
].map(val => val * paddedWidthRelative)
].map(val => val * paddings.x)
return new FaceDetection(
scoresData[idx],
new Rect(
......
import * as tf from '@tensorflow/tfjs-core';
import { NeuralNetwork } from '../commons/NeuralNetwork';
import { normalize } from '../commons/normalize';
import { NetInput } from '../NetInput';
import { toNetInput } from '../toNetInput';
import { TNetInput } from '../types';
import { convDown } from './convLayer';
import { extractParams } from './extractParams';
import { loadQuantizedParams } from './loadQuantizedParams';
import { normalize } from './normalize';
import { residual, residualDown } from './residualLayer';
import { NetParams } from './types';
......@@ -28,7 +28,8 @@ export class FaceRecognitionNet extends NeuralNetwork<NetParams> {
return tf.tidy(() => {
const batchTensor = input.toBatchTensor(150, true)
const normalized = normalize(batchTensor)
const meanRgb = [122.782, 117.001, 104.298]
const normalized = normalize(batchTensor, meanRgb).div(tf.scalar(256)) as tf.Tensor4D
let out = convDown(normalized, params.conv32_down)
out = tf.maxPool(out, 3, 2, 'valid')
......
import * as tf from '@tensorflow/tfjs-core';
export function normalize(x: tf.Tensor4D): tf.Tensor4D {
return tf.tidy(() => {
const avg_r = tf.fill([...x.shape.slice(0, 3), 1], 122.782);
const avg_g = tf.fill([...x.shape.slice(0, 3), 1], 117.001);
const avg_b = tf.fill([...x.shape.slice(0, 3), 1], 104.298);
const avg_rgb = tf.concat([avg_r, avg_g, avg_b], 3)
return tf.div(tf.sub(x, avg_rgb), tf.scalar(256))
})
}
\ No newline at end of file
import * as tf from '@tensorflow/tfjs-core';
import { allFacesFactory, allFacesMtcnnFactory } from './allFacesFactory';
import { allFacesMtcnnFactory, allFacesSsdMobilenetv1Factory, allFacesTinyYolov2Factory } from './allFacesFactory';
import { FaceDetection } from './FaceDetection';
import { FaceDetectionNet } from './faceDetectionNet/FaceDetectionNet';
import { FaceLandmarkNet } from './faceLandmarkNet/FaceLandmarkNet';
......@@ -21,15 +21,15 @@ export const recognitionNet = new FaceRecognitionNet()
// nets need more specific names, to avoid ambiguity in future
// when alternative net implementations are provided
export const nets = {
ssdMobilenet: detectionNet,
ssdMobilenetv1: detectionNet,
faceLandmark68Net: landmarkNet,
faceRecognitionNet: recognitionNet,
mtcnn: new Mtcnn(),
tinyYolov2: new TinyYolov2()
}
export function loadFaceDetectionModel(url: string) {
return nets.ssdMobilenet.load(url)
export function loadSsdMobilenetv1Model(url: string) {
return nets.ssdMobilenetv1.load(url)
}
export function loadFaceLandmarkModel(url: string) {
......@@ -48,9 +48,13 @@ export function loadTinyYolov2Model(url: string) {
return nets.tinyYolov2.load(url)
}
export function loadFaceDetectionModel(url: string) {
return loadSsdMobilenetv1Model(url)
}
export function loadModels(url: string) {
return Promise.all([
loadFaceDetectionModel(url),
loadSsdMobilenetv1Model(url),
loadFaceLandmarkModel(url),
loadFaceRecognitionModel(url),
loadMtcnnModel(url),
......@@ -63,7 +67,7 @@ export function locateFaces(
minConfidence?: number,
maxResults?: number
): Promise<FaceDetection[]> {
return nets.ssdMobilenet.locateFaces(input, minConfidence, maxResults)
return nets.ssdMobilenetv1.locateFaces(input, minConfidence, maxResults)
}
export function detectLandmarks(
......@@ -92,14 +96,26 @@ export function tinyYolov2(
return nets.tinyYolov2.locateFaces(input, forwardParams)
}
export type allFacesFunction = (
export type allFacesSsdMobilenetv1Function = (
input: tf.Tensor | NetInput | TNetInput,
minConfidence?: number,
useBatchProcessing?: boolean
) => Promise<FullFaceDescription[]>
export const allFaces: allFacesFunction = allFacesFactory(
nets.ssdMobilenet,
export const allFacesSsdMobilenetv1: allFacesSsdMobilenetv1Function = allFacesSsdMobilenetv1Factory(
nets.ssdMobilenetv1,
nets.faceLandmark68Net,
nets.faceRecognitionNet
)
export type allFacesTinyYolov2Function = (
input: tf.Tensor | NetInput | TNetInput,
forwardParams?: TinyYolov2ForwardParams,
useBatchProcessing?: boolean
) => Promise<FullFaceDescription[]>
export const allFacesTinyYolov2: allFacesTinyYolov2Function = allFacesTinyYolov2Factory(
nets.tinyYolov2,
nets.faceLandmark68Net,
nets.faceRecognitionNet
)
......@@ -114,3 +130,5 @@ export const allFacesMtcnn: allFacesMtcnnFunction = allFacesMtcnnFactory(
nets.mtcnn,
nets.faceRecognitionNet
)
export const allFaces = allFacesSsdMobilenetv1
\ No newline at end of file
......@@ -5,6 +5,7 @@ export {
}
export * from './BoundingBox';
export * from './FaceDetection';
export * from './FullFaceDescription';
export * from './NetInput';
......@@ -19,8 +20,11 @@ export * from './faceDetectionNet';
export * from './faceLandmarkNet';
export * from './faceRecognitionNet';
export * from './globalApi';
export * from './iou';
export * from './mtcnn';
export * from './padToSquare';
export * from './tinyYolov2';
export * from './toNetInput';
export * from './utils'
\ No newline at end of file
export * from './utils';
export * from './types';
\ No newline at end of file
import { BoundingBox } from './BoundingBox';
export function iou(box1: BoundingBox, box2: BoundingBox, isIOU: boolean = true) {
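// returns intersection over union of the two boxes if isIOU is true, otherwise intersection over the smaller box area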
const width = Math.max(0.0, Math.min(box1.right, box2.right) - Math.max(box1.left, box2.left) + 1)
const height = Math.max(0.0, Math.min(box1.bottom, box2.bottom) - Math.max(box1.top, box2.top) + 1)
const interSection = width * height
return isIOU
? interSection / (box1.area + box2.area - interSection)
: interSection / Math.min(box1.area, box2.area)
}
\ No newline at end of file
......@@ -4,21 +4,37 @@ import { BoundingBox } from '../BoundingBox';
import { convLayer } from '../commons/convLayer';
import { NeuralNetwork } from '../commons/NeuralNetwork';
import { nonMaxSuppression } from '../commons/nonMaxSuppression';
import { normalize } from '../commons/normalize';
import { FaceDetection } from '../FaceDetection';
import { NetInput } from '../NetInput';
import { Point } from '../Point';
import { toNetInput } from '../toNetInput';
import { TNetInput } from '../types';
import { BOX_ANCHORS, INPUT_SIZES, IOU_THRESHOLD, NUM_BOXES } from './config';
import { sigmoid } from '../utils';
import { BOX_ANCHORS, BOX_ANCHORS_SEPARABLE, INPUT_SIZES, IOU_THRESHOLD, MEAN_RGB, NUM_BOXES } from './config';
import { convWithBatchNorm } from './convWithBatchNorm';
import { extractParams } from './extractParams';
import { getDefaultParams } from './getDefaultParams';
import { loadQuantizedParams } from './loadQuantizedParams';
import { NetParams, TinyYolov2ForwardParams } from './types';
import { NetParams, PostProcessingParams, TinyYolov2ForwardParams } from './types';
export class TinyYolov2 extends NeuralNetwork<NetParams> {
constructor() {
private _withSeparableConvs: boolean
private _anchors: Point[]
constructor(withSeparableConvs: boolean = true) {
super('TinyYolov2')
this._withSeparableConvs = withSeparableConvs
this._anchors = withSeparableConvs ? BOX_ANCHORS_SEPARABLE : BOX_ANCHORS
}
public get withSeparableConvs(): boolean {
return this._withSeparableConvs
}
public get anchors(): Point[] {
return this._anchors
}
public forwardInput(input: NetInput, inputSize: number): tf.Tensor4D {
......@@ -30,7 +46,12 @@ export class TinyYolov2 extends NeuralNetwork<NetParams> {
}
const out = tf.tidy(() => {
const batchTensor = input.toBatchTensor(inputSize, false).div(tf.scalar(255)).toFloat() as tf.Tensor4D
let batchTensor = input.toBatchTensor(inputSize, false)
batchTensor = this.withSeparableConvs
? normalize(batchTensor, MEAN_RGB)
: batchTensor
batchTensor = batchTensor.div(tf.scalar(256)) as tf.Tensor4D
let out = convWithBatchNorm(batchTensor, params.conv0)
out = tf.maxPool(out, [2, 2], [2, 2], 'same')
......@@ -67,44 +88,78 @@ export class TinyYolov2 extends NeuralNetwork<NetParams> {
: _inputSize
if (typeof inputSize !== 'number') {
throw new Error(`TinyYolov2 - unkown inputSize: ${inputSize}, expected number or one of xs | sm | md | lg`)
throw new Error(`TinyYolov2 - unknown inputSize: ${inputSize}, expected number or one of xs | sm | md | lg`)
}
const netInput = await toNetInput(input, true)
const out = await this.forwardInput(netInput, inputSize)
const numCells = out.shape[1]
const out0 = tf.tidy(() => tf.unstack(out)[0].expandDims()) as tf.Tensor4D
const inputDimensions = {
width: netInput.getInputWidth(0),
height: netInput.getInputHeight(0)
}
const paddings = netInput.getRelativePaddings(0)
const results = this.postProcess(out0, { scoreThreshold, paddings })
const boxes = results.map(res => res.box)
const scores = results.map(res => res.score)
const [boxesTensor, scoresTensor] = tf.tidy(() => {
const reshaped = out.reshape([numCells, numCells, NUM_BOXES, 6])
out.dispose()
out0.dispose()
const indices = nonMaxSuppression(
boxes.map(box => box.rescale(inputSize)),
scores,
IOU_THRESHOLD,
true
)
const detections = indices.map(idx =>
new FaceDetection(
scores[idx],
boxes[idx].toRect(),
inputDimensions
)
)
return detections
}
public postProcess(outputTensor: tf.Tensor4D, { scoreThreshold, paddings }: PostProcessingParams) {
const numCells = outputTensor.shape[1]
const [boxesTensor, scoresTensor] = tf.tidy(() => {
const reshaped = outputTensor.reshape([numCells, numCells, NUM_BOXES, this.withSeparableConvs ? 5 : 6])
const boxes = reshaped.slice([0, 0, 0, 0], [numCells, numCells, NUM_BOXES, 4])
const scores = reshaped.slice([0, 0, 0, 4], [numCells, numCells, NUM_BOXES, 1])
return [boxes, scores]
})
const expit = (x: number): number => 1 / (1 + Math.exp(-x))
const paddedHeightRelative = (netInput.getPaddings(0).y + netInput.getInputHeight(0)) / netInput.getInputHeight(0)
const paddedWidthRelative = (netInput.getPaddings(0).x + netInput.getInputWidth(0)) / netInput.getInputWidth(0)
const boxes: BoundingBox[] = []
const scores: number[] = []
const results = []
for (let row = 0; row < numCells; row ++) {
for (let col = 0; col < numCells; col ++) {
for (let box = 0; box < NUM_BOXES; box ++) {
const score = expit(scoresTensor.get(row, col, box, 0))
if (score > scoreThreshold) {
const ctX = ((col + expit(boxesTensor.get(row, col, box, 0))) / numCells) * paddedWidthRelative
const ctY = ((row + expit(boxesTensor.get(row, col, box, 1))) / numCells) * paddedHeightRelative
const width = ((Math.exp(boxesTensor.get(row, col, box, 2)) * BOX_ANCHORS[box].x) / numCells) * paddedWidthRelative
const height = ((Math.exp(boxesTensor.get(row, col, box, 3)) * BOX_ANCHORS[box].y) / numCells) * paddedHeightRelative
for (let anchor = 0; anchor < NUM_BOXES; anchor ++) {
const score = sigmoid(scoresTensor.get(row, col, anchor, 0))
if (!scoreThreshold || score > scoreThreshold) {
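// decode the box relative to the grid: cell offset plus sigmoid of the predicted center,
// anchor dimensions scaled by exp of the predicted size, corrected by the relative paddings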
const ctX = ((col + sigmoid(boxesTensor.get(row, col, anchor, 0))) / numCells) * paddings.x
const ctY = ((row + sigmoid(boxesTensor.get(row, col, anchor, 1))) / numCells) * paddings.y
const width = ((Math.exp(boxesTensor.get(row, col, anchor, 2)) * this.anchors[anchor].x) / numCells) * paddings.x
const height = ((Math.exp(boxesTensor.get(row, col, anchor, 3)) * this.anchors[anchor].y) / numCells) * paddings.y
const x = (ctX - (width / 2))
const y = (ctY - (height / 2))
boxes.push(new BoundingBox(x, y, x + width, y + height))
scores.push(score)
results.push({
box: new BoundingBox(x, y, x + width, y + height),
score,
row,
col,
anchor
})
}
}
}
......@@ -113,34 +168,14 @@ export class TinyYolov2 extends NeuralNetwork<NetParams> {
boxesTensor.dispose()
scoresTensor.dispose()
const indices = nonMaxSuppression(
boxes.map(box => new BoundingBox(
box.left * inputSize,
box.top * inputSize,
box.right * inputSize,
box.bottom * inputSize
)),
scores,
IOU_THRESHOLD,
true
)
const detections = indices.map(idx =>
new FaceDetection(
scores[idx],
boxes[idx].toRect(),
{ width: netInput.getInputWidth(0), height: netInput.getInputHeight(0) }
)
)
return detections
return results
}
protected loadQuantizedParams(uri: string | undefined) {
return loadQuantizedParams(uri)
return loadQuantizedParams(uri, this.withSeparableConvs)
}
protected extractParams(weights: Float32Array) {
return extractParams(weights)
return extractParams(weights, this.withSeparableConvs)
}
}
\ No newline at end of file
......@@ -11,3 +11,13 @@ export const BOX_ANCHORS = [
new Point(10.246, 4.59428),
new Point(12.6868, 11.8741)
]
export const BOX_ANCHORS_SEPARABLE = [
new Point(1.603231, 2.094468),
new Point(6.041143, 7.080126),
new Point(2.882459, 3.518061),
new Point(4.266906, 5.178857),
new Point(9.041765, 10.66308)
]
export const MEAN_RGB = [117.001, 114.697, 97.404]
\ No newline at end of file
import * as tf from '@tensorflow/tfjs-core';
import { leaky } from './leaky';
import { ConvWithBatchNorm } from './types';
import { ConvWithBatchNorm, SeparableConvParams } from './types';
export function convWithBatchNorm(x: tf.Tensor4D, params: ConvWithBatchNorm): tf.Tensor4D {
export function convWithBatchNorm(x: tf.Tensor4D, params: ConvWithBatchNorm | SeparableConvParams): tf.Tensor4D {
return tf.tidy(() => {
let out = tf.pad(x, [[0, 0], [1, 1], [1, 1], [0, 0]]) as tf.Tensor4D
if (params instanceof SeparableConvParams) {
out = tf.separableConv2d(out, params.depthwise_filter, params.pointwise_filter, [1, 1], 'valid')
out = tf.add(out, params.bias)
} else {
out = tf.conv2d(out, params.conv.filters, [1, 1], 'valid')
out = tf.sub(out, params.bn.sub)
out = tf.mul(out, params.bn.truediv)
out = tf.add(out, params.conv.bias)
}
return leaky(out)
})
}
\ No newline at end of file
......@@ -3,7 +3,7 @@ import * as tf from '@tensorflow/tfjs-core';
import { extractConvParamsFactory } from '../commons/extractConvParamsFactory';
import { extractWeightsFactory } from '../commons/extractWeightsFactory';
import { ExtractWeightsFunction, ParamMapping } from '../commons/types';
import { BatchNorm, ConvWithBatchNorm, NetParams } from './types';
import { BatchNorm, ConvWithBatchNorm, NetParams, SeparableConvParams } from './types';
function extractorsFactory(extractWeights: ExtractWeightsFunction, paramMappings: ParamMapping[]) {
......@@ -30,14 +30,33 @@ function extractorsFactory(extractWeights: ExtractWeightsFunction, paramMappings
return { conv, bn }
}
function extractSeparableConvParams(channelsIn: number, channelsOut: number, mappedPrefix: string): SeparableConvParams {
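// a separable conv consists of a 3x3 depthwise filter (one filter per input channel), a 1x1 pointwise filter mixing the channels, and a bias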
const depthwise_filter = tf.tensor4d(extractWeights(3 * 3 * channelsIn), [3, 3, channelsIn, 1])
const pointwise_filter = tf.tensor4d(extractWeights(channelsIn * channelsOut), [1, 1, channelsIn, channelsOut])
const bias = tf.tensor1d(extractWeights(channelsOut))
paramMappings.push(
{ paramPath: `${mappedPrefix}/depthwise_filter` },
{ paramPath: `${mappedPrefix}/pointwise_filter` },
{ paramPath: `${mappedPrefix}/bias` }
)
return new SeparableConvParams(
depthwise_filter,
pointwise_filter,
bias
)
}
return {
extractConvParams,
extractConvWithBatchNormParams
extractConvWithBatchNormParams,
extractSeparableConvParams
}
}
export function extractParams(weights: Float32Array): { params: NetParams, paramMappings: ParamMapping[] } {
export function extractParams(weights: Float32Array, withSeparableConvs: boolean): { params: NetParams, paramMappings: ParamMapping[] } {
const {
extractWeights,
......@@ -48,18 +67,22 @@ export function extractParams(weights: Float32Array): { params: NetParams, param
const {
extractConvParams,
extractConvWithBatchNormParams
extractConvWithBatchNormParams,
extractSeparableConvParams
} = extractorsFactory(extractWeights, paramMappings)
const conv0 = extractConvWithBatchNormParams(3, 16, 'conv0')
const conv1 = extractConvWithBatchNormParams(16, 32, 'conv1')
const conv2 = extractConvWithBatchNormParams(32, 64, 'conv2')
const conv3 = extractConvWithBatchNormParams(64, 128, 'conv3')
const conv4 = extractConvWithBatchNormParams(128, 256, 'conv4')
const conv5 = extractConvWithBatchNormParams(256, 512, 'conv5')
const conv6 = extractConvWithBatchNormParams(512, 1024, 'conv6')
const conv7 = extractConvWithBatchNormParams(1024, 1024, 'conv7')
const conv8 = extractConvParams(1024, 30, 1, 'conv8')
const extractConvFn = withSeparableConvs ? extractSeparableConvParams : extractConvWithBatchNormParams
const numAnchorEncodings = withSeparableConvs ? 5 : 6
const conv0 = extractConvFn(3, 16, 'conv0')
const conv1 = extractConvFn(16, 32, 'conv1')
const conv2 = extractConvFn(32, 64, 'conv2')
const conv3 = extractConvFn(64, 128, 'conv3')
const conv4 = extractConvFn(128, 256, 'conv4')
const conv5 = extractConvFn(256, 512, 'conv5')
const conv6 = extractConvFn(512, 1024, 'conv6')
const conv7 = extractConvFn(1024, 1024, 'conv7')
const conv8 = extractConvParams(1024, 5 * numAnchorEncodings, 1, 'conv8')
if (getRemainingWeights().length !== 0) {
throw new Error(`weights remaining after extract: ${getRemainingWeights().length}`)
......
......@@ -4,7 +4,7 @@ export function getDefaultParams(params: TinyYolov2ForwardParams) {
return Object.assign(
{},
{
sizeType: SizeType.MD,
inputSize: SizeType.MD,
scoreThreshold: 0.5
},
params
......
......@@ -2,8 +2,8 @@ import { TinyYolov2 } from './TinyYolov2';
export * from './TinyYolov2';
export function createTinyYolov2(weights: Float32Array) {
const net = new TinyYolov2()
export function createTinyYolov2(weights: Float32Array, withSeparableConvs: boolean = true) {
const net = new TinyYolov2(withSeparableConvs)
net.extractWeights(weights)
return net
}
\ No newline at end of file
......@@ -2,6 +2,8 @@ import * as tf from '@tensorflow/tfjs-core';
export function leaky(x: tf.Tensor4D): tf.Tensor4D {
return tf.tidy(() => {
return tf.maximum(x, tf.mul(x, tf.scalar(0.10000000149011612)))
const min = tf.mul(x, tf.scalar(0.10000000149011612))
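// relu(x - min) + min is equivalent to max(x, min), i.e. leaky relu with a slope of 0.1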
return tf.add(tf.relu(tf.sub(x, min)), min)
//return tf.maximum(x, min)
})
}
\ No newline at end of file
......@@ -4,9 +4,10 @@ import { disposeUnusedWeightTensors } from '../commons/disposeUnusedWeightTensor
import { extractWeightEntryFactory } from '../commons/extractWeightEntryFactory';
import { loadWeightMap } from '../commons/loadWeightMap';
import { ConvParams, ParamMapping } from '../commons/types';
import { BatchNorm, ConvWithBatchNorm, NetParams } from './types';
import { BatchNorm, ConvWithBatchNorm, NetParams, SeparableConvParams } from './types';
const DEFAULT_MODEL_NAME = 'tiny_yolov2_model'
const DEFAULT_MODEL_NAME_SEPARABLE_CONV = 'tiny_yolov2_separable_conv_model'
function extractorsFactory(weightMap: any, paramMappings: ParamMapping[]) {
......@@ -30,35 +31,51 @@ function extractorsFactory(weightMap: any, paramMappings: ParamMapping[]) {
return { conv, bn }
}
function extractSeparableConvParams(prefix: string): SeparableConvParams {
const depthwise_filter = extractWeightEntry<tf.Tensor4D>(`${prefix}/depthwise_filter`, 4)
const pointwise_filter = extractWeightEntry<tf.Tensor4D>(`${prefix}/pointwise_filter`, 4)
const bias = extractWeightEntry<tf.Tensor1D>(`${prefix}/bias`, 1)
return new SeparableConvParams(
depthwise_filter,
pointwise_filter,
bias
)
}
return {
extractConvParams,
extractConvWithBatchNormParams
extractConvWithBatchNormParams,
extractSeparableConvParams
}
}
export async function loadQuantizedParams(
uri: string | undefined
uri: string | undefined,
withSeparableConvs: boolean
): Promise<{ params: NetParams, paramMappings: ParamMapping[] }> {
const weightMap = await loadWeightMap(uri, DEFAULT_MODEL_NAME)
const weightMap = await loadWeightMap(uri, withSeparableConvs ? DEFAULT_MODEL_NAME_SEPARABLE_CONV : DEFAULT_MODEL_NAME)
const paramMappings: ParamMapping[] = []
const {
extractConvParams,
extractConvWithBatchNormParams
extractConvWithBatchNormParams,
extractSeparableConvParams
} = extractorsFactory(weightMap, paramMappings)
const extractConvFn = withSeparableConvs ? extractSeparableConvParams : extractConvWithBatchNormParams
const params = {
conv0: extractConvWithBatchNormParams('conv0'),
conv1: extractConvWithBatchNormParams('conv1'),
conv2: extractConvWithBatchNormParams('conv2'),
conv3: extractConvWithBatchNormParams('conv3'),
conv4: extractConvWithBatchNormParams('conv4'),
conv5: extractConvWithBatchNormParams('conv5'),
conv6: extractConvWithBatchNormParams('conv6'),
conv7: extractConvWithBatchNormParams('conv7'),
conv0: extractConvFn('conv0'),
conv1: extractConvFn('conv1'),
conv2: extractConvFn('conv2'),
conv3: extractConvFn('conv3'),
conv4: extractConvFn('conv4'),
conv5: extractConvFn('conv5'),
conv6: extractConvFn('conv6'),
conv7: extractConvFn('conv7'),
conv8: extractConvParams('conv8')
}
......
import * as tf from '@tensorflow/tfjs-core';
import { ConvParams } from '../commons/types';
import { Point } from '../Point';
export type BatchNorm = {
sub: tf.Tensor1D
truediv: tf.Tensor1D
}
export class SeparableConvParams {
constructor(
public depthwise_filter: tf.Tensor4D,
public pointwise_filter: tf.Tensor4D,
public bias: tf.Tensor1D
) {}
}
export type ConvWithBatchNorm = {
conv: ConvParams
bn: BatchNorm
}
export type NetParams = {
conv0: ConvWithBatchNorm
conv1: ConvWithBatchNorm
conv2: ConvWithBatchNorm
conv3: ConvWithBatchNorm
conv4: ConvWithBatchNorm
conv5: ConvWithBatchNorm
conv6: ConvWithBatchNorm
conv7: ConvWithBatchNorm
conv0: ConvWithBatchNorm | SeparableConvParams
conv1: ConvWithBatchNorm | SeparableConvParams
conv2: ConvWithBatchNorm | SeparableConvParams
conv3: ConvWithBatchNorm | SeparableConvParams
conv4: ConvWithBatchNorm | SeparableConvParams
conv5: ConvWithBatchNorm | SeparableConvParams
conv6: ConvWithBatchNorm | SeparableConvParams
conv7: ConvWithBatchNorm | SeparableConvParams
conv8: ConvParams
}
......@@ -35,3 +44,8 @@ export type TinyYolov2ForwardParams = {
inputSize?: SizeType | number
scoreThreshold?: number
}
export type PostProcessingParams = {
scoreThreshold?: number
paddings: Point
}
\ No newline at end of file
......@@ -15,6 +15,14 @@ export function round(num: number) {
return Math.floor(num * 100) / 100
}
export function sigmoid(x: number) {
return 1 / (1 + Math.exp(-x))
}
export function isDimensions(obj: any): boolean {
return obj && obj.width && obj.height
}
export function resolveInput(arg: string | any) {
if (typeof arg === 'string') {
return document.getElementById(arg)
......@@ -37,14 +45,14 @@ export function awaitMediaLoaded(media: HTMLImageElement | HTMLVideoElement | HT
if (!e.currentTarget) return
e.currentTarget.removeEventListener('load', onLoad)
e.currentTarget.removeEventListener('error', onError)
resolve()
resolve(e)
}
function onError(e: Event) {
if (!e.currentTarget) return
e.currentTarget.removeEventListener('load', onLoad)
e.currentTarget.removeEventListener('error', onError)
reject()
reject(e)
}
media.addEventListener('load', onLoad)
......
......@@ -6,7 +6,7 @@ import { toNetInput } from '../../../src';
import * as tf from '@tensorflow/tfjs-core';
import { Point } from '../../../src/Point';
describe('allFaces', () => {
describe('allFacesSsdMobilenetv1', () => {
let imgEl: HTMLImageElement
let facesFaceLandmarkPositions: Point[][]
......@@ -19,14 +19,14 @@ describe('allFaces', () => {
facesFaceDescriptors = await (await fetch('base/test/data/facesFaceDescriptorsSsd.json')).json()
})
describeWithNets('computes full face descriptions', { withAllFaces: true }, ({ allFaces }) => {
describeWithNets('computes full face descriptions', { withAllFacesSsdMobilenetv1: true }, ({ allFacesSsdMobilenetv1 }) => {
const expectedScores = [0.97, 0.88, 0.83, 0.82, 0.59, 0.52]
const maxBoxDelta = 5
const maxLandmarkPointsDelta = 1
it('scores > 0.8', async () => {
const results = await allFaces(imgEl, 0.8)
const results = await allFacesSsdMobilenetv1(imgEl, 0.8)
expect(results.length).toEqual(4)
results.forEach(({ detection, landmarks, descriptor }, i) => {
......@@ -40,7 +40,7 @@ describe('allFaces', () => {
})
it('scores > 0.5', async () => {
const results = await allFaces(imgEl, 0.5)
const results = await allFacesSsdMobilenetv1(imgEl, 0.5)
expect(results.length).toEqual(6)
results.forEach(({ detection, landmarks, descriptor }, i) => {
......@@ -55,11 +55,11 @@ describe('allFaces', () => {
})
describeWithNets('no memory leaks', { withAllFaces: true }, ({ allFaces }) => {
describeWithNets('no memory leaks', { withAllFacesSsdMobilenetv1: true }, ({ allFacesSsdMobilenetv1 }) => {
it('single image element', async () => {
await expectAllTensorsReleased(async () => {
await allFaces(imgEl)
await allFacesSsdMobilenetv1(imgEl)
})
})
......@@ -68,7 +68,7 @@ describe('allFaces', () => {
await expectAllTensorsReleased(async () => {
const netInput = (new NetInput([tensor])).managed()
await allFaces(netInput)
await allFacesSsdMobilenetv1(netInput)
})
tensor.dispose()
......@@ -78,7 +78,7 @@ describe('allFaces', () => {
const tensor = tf.tidy(() => tf.fromPixels(imgEl).expandDims()) as tf.Tensor4D
await expectAllTensorsReleased(async () => {
await allFaces(await toNetInput(tensor, true))
await allFacesSsdMobilenetv1(await toNetInput(tensor, true))
})
tensor.dispose()
......
import * as faceapi from '../../../src';
import { describeWithNets, expectAllTensorsReleased, expectRectClose, expectPointClose, expectMaxDelta } from '../../utils';
import { expectedTinyYolov2SeparableConvBoxes } from './expectedResults';
import { NetInput } from '../../../src/NetInput';
import { toNetInput } from '../../../src';
import * as tf from '@tensorflow/tfjs-core';
import { Point } from '../../../src/Point';
import { SizeType } from '../../../src/tinyYolov2/types';
describe('allFacesTinyYolov2', () => {
let imgEl: HTMLImageElement
let facesFaceLandmarkPositions: Point[][]
let facesFaceDescriptors: number[][]
beforeAll(async () => {
const img = await (await fetch('base/test/images/faces.jpg')).blob()
imgEl = await faceapi.bufferToImage(img)
facesFaceLandmarkPositions = await (await fetch('base/test/data/facesFaceLandmarkPositions.json')).json()
facesFaceDescriptors = await (await fetch('base/test/data/facesFaceDescriptorsSsd.json')).json()
})
describeWithNets('computes full face descriptions', { withAllFacesTinyYolov2: true }, ({ allFacesTinyYolov2 }) => {
it('SizeType.LG', async () => {
const expectedScores = [0.9, 0.9, 0.89, 0.85, 0.85, 0.85]
const maxBoxDelta = 5
const maxLandmarkPointsDelta = 10
const maxDescriptorDelta = 0.06
const results = await allFacesTinyYolov2(imgEl, { inputSize: SizeType.LG })
const detectionOrder = [0, 2, 3, 4, 1, 5]
expect(results.length).toEqual(6)
results.forEach(({ detection, landmarks, descriptor }, i) => {
expect(detection.getImageWidth()).toEqual(imgEl.width)
expect(detection.getImageHeight()).toEqual(imgEl.height)
expect(detection.getScore()).toBeCloseTo(expectedScores[i], 2)
expectRectClose(detection.getBox(), expectedTinyYolov2SeparableConvBoxes[i], maxBoxDelta)
landmarks.getPositions().forEach((pt, j) => expectPointClose(pt, facesFaceLandmarkPositions[detectionOrder[i]][j], maxLandmarkPointsDelta))
descriptor.forEach((val, j) => expectMaxDelta(val, facesFaceDescriptors[detectionOrder[i]][j], maxDescriptorDelta))
})
})
it('SizeType.MD', async () => {
const expectedScores = [0.85, 0.85, 0.84, 0.83, 0.8, 0.8]
const maxBoxDelta = 17
const maxLandmarkPointsDelta = 16
const maxDescriptorDelta = 0.05
const results = await allFacesTinyYolov2(imgEl, { inputSize: SizeType.MD })
const boxOrder = [5, 1, 4, 3, 2, 0]
const detectionOrder = [5, 2, 1, 4, 3, 0]
expect(results.length).toEqual(6)
results.forEach(({ detection, landmarks, descriptor }, i) => {
expect(detection.getImageWidth()).toEqual(imgEl.width)
expect(detection.getImageHeight()).toEqual(imgEl.height)
expect(detection.getScore()).toBeCloseTo(expectedScores[i], 2)
expectRectClose(detection.getBox(), expectedTinyYolov2SeparableConvBoxes[boxOrder[i]], maxBoxDelta)
landmarks.getPositions().forEach((pt, j) => expectPointClose(pt, facesFaceLandmarkPositions[detectionOrder[i]][j], maxLandmarkPointsDelta))
descriptor.forEach((val, j) => expectMaxDelta(val, facesFaceDescriptors[detectionOrder[i]][j], maxDescriptorDelta))
})
})
})
describeWithNets('no memory leaks', { withAllFacesTinyYolov2: true }, ({ allFacesTinyYolov2 }) => {
it('single image element', async () => {
await expectAllTensorsReleased(async () => {
await allFacesTinyYolov2(imgEl)
})
})
it('single tf.Tensor3D', async () => {
const tensor = tf.fromPixels(imgEl)
await expectAllTensorsReleased(async () => {
const netInput = (new NetInput([tensor])).managed()
await allFacesTinyYolov2(netInput)
})
tensor.dispose()
})
it('single batch size 1 tf.Tensor4Ds', async () => {
const tensor = tf.tidy(() => tf.fromPixels(imgEl).expandDims()) as tf.Tensor4D
await expectAllTensorsReleased(async () => {
await allFacesTinyYolov2(await toNetInput(tensor, true))
})
tensor.dispose()
})
})
})
\ No newline at end of file
......@@ -29,6 +29,15 @@ export const expectedTinyYolov2Boxes = [
{ x: 87, y: 30, width: 92, height: 93 }
]
export const expectedTinyYolov2SeparableConvBoxes = [
{ x: 42, y: 257, width: 111, height: 121 },
{ x: 454, y: 175, width: 104, height: 121 },
{ x: 230, y: 45, width: 94, height: 104 },
{ x: 574, y: 62, width: 88, height: 113 },
{ x: 260, y: 233, width: 82, height: 104 },
{ x: 83, y: 24, width: 85, height: 111 }
]
export const expectedMtcnnFaceLandmarks = [
[new Point(117, 58), new Point(156, 63), new Point(141, 86), new Point(109, 98), new Point(147, 104)],
[new Point(82, 292), new Point(134, 304), new Point(104, 330), new Point(72, 342), new Point(120, 353)],
......@@ -38,7 +47,6 @@ export const expectedMtcnnFaceLandmarks = [
[new Point(489, 224), new Point(534, 223), new Point(507, 250), new Point(493, 271), new Point(530, 270)]
]
export function expectMtcnnResults(
results: { faceDetection: faceapi.FaceDetection, faceLandmarks: faceapi.FaceLandmarks5 }[],
boxOrder: number[],
......
import * as faceapi from '../../../src';
import { SizeType } from '../../../src/tinyYolov2/types';
import { describeWithNets, expectAllTensorsReleased, expectRectClose } from '../../utils';
import { expectedTinyYolov2Boxes } from './expectedResults';
import { expectedTinyYolov2Boxes, expectedTinyYolov2SeparableConvBoxes } from './expectedResults';
describe('tinyYolov2', () => {
......@@ -13,7 +12,7 @@ describe('tinyYolov2', () => {
imgEl = await faceapi.bufferToImage(img)
})
describeWithNets('quantized weights', { withTinyYolov2: { quantized: true } }, ({ tinyYolov2 }) => {
describeWithNets('quantized weights', { withTinyYolov2: { quantized: true, withSeparableConv: false } }, ({ tinyYolov2 }) => {
it('inputSize lg, finds all faces', async () => {
const detections = await tinyYolov2.locateFaces(imgEl, { inputSize: SizeType.LG })
......@@ -59,7 +58,7 @@ describe('tinyYolov2', () => {
})
describeWithNets('uncompressed weights', { withTinyYolov2: { quantized: false } }, ({ tinyYolov2 }) => {
describeWithNets('uncompressed weights', { withTinyYolov2: { quantized: false, withSeparableConv: false } }, ({ tinyYolov2 }) => {
it('inputSize lg, finds all faces', async () => {
const detections = await tinyYolov2.locateFaces(imgEl, { inputSize: SizeType.LG })
......@@ -113,7 +112,7 @@ describe('tinyYolov2', () => {
await expectAllTensorsReleased(async () => {
const res = await fetch('base/weights_uncompressed/tiny_yolov2_model.weights')
const weights = new Float32Array(await res.arrayBuffer())
const net = faceapi.createTinyYolov2(weights)
const net = faceapi.createTinyYolov2(weights, false)
net.dispose()
})
})
......@@ -124,8 +123,8 @@ describe('tinyYolov2', () => {
it('disposes all param tensors', async () => {
await expectAllTensorsReleased(async () => {
const net = new faceapi.TinyYolov2()
await net.load('base/weights')
const net = new faceapi.TinyYolov2(false)
await net.load('base/weights_unused')
net.dispose()
})
})
......
import * as faceapi from '../../../src';
import { SizeType } from '../../../src/tinyYolov2/types';
import { describeWithNets, expectAllTensorsReleased, expectRectClose } from '../../utils';
import { expectedTinyYolov2SeparableConvBoxes } from './expectedResults';
describe('tinyYolov2, with separable convolutions', () => {
let imgEl: HTMLImageElement
beforeAll(async () => {
const img = await (await fetch('base/test/images/faces.jpg')).blob()
imgEl = await faceapi.bufferToImage(img)
})
describeWithNets('quantized weights', { withTinyYolov2: { quantized: true } }, ({ tinyYolov2 }) => {
it('inputSize lg, finds all faces', async () => {
const detections = await tinyYolov2.locateFaces(imgEl, { inputSize: SizeType.LG })
const expectedScores = [0.9, 0.9, 0.89, 0.85, 0.85, 0.85]
const maxBoxDelta = 1
const boxOrder = [0, 1, 2, 3, 4, 5]
expect(detections.length).toEqual(6)
detections.forEach((det, i) => {
expect(det.getScore()).toBeCloseTo(expectedScores[i], 2)
expectRectClose(det.getBox(), expectedTinyYolov2SeparableConvBoxes[boxOrder[i]], maxBoxDelta)
})
})
it('inputSize md, finds all faces', async () => {
const detections = await tinyYolov2.locateFaces(imgEl, { inputSize: SizeType.MD })
const expectedScores = [0.85, 0.85, 0.84, 0.83, 0.8, 0.8]
const maxBoxDelta = 17
const boxOrder = [5, 1, 4, 3, 2, 0]
expect(detections.length).toEqual(6)
detections.forEach((det, i) => {
expect(det.getScore()).toBeCloseTo(expectedScores[i], 2)
expectRectClose(det.getBox(), expectedTinyYolov2SeparableConvBoxes[boxOrder[i]], maxBoxDelta)
})
})
it('inputSize custom, finds all faces', async () => {
const detections = await tinyYolov2.locateFaces(imgEl, { inputSize: 416 })
const expectedScores = [0.85, 0.85, 0.84, 0.83, 0.8, 0.8]
const maxBoxDelta = 17
const boxOrder = [5, 1, 4, 3, 2, 0]
expect(detections.length).toEqual(6)
detections.forEach((det, i) => {
expect(det.getScore()).toBeCloseTo(expectedScores[i], 2)
expectRectClose(det.getBox(), expectedTinyYolov2SeparableConvBoxes[boxOrder[i]], maxBoxDelta)
})
})
})
describe('no memory leaks', () => {
describe('NeuralNetwork, uncompressed model', () => {
it('disposes all param tensors', async () => {
await expectAllTensorsReleased(async () => {
const res = await fetch('base/weights_uncompressed/tiny_yolov2_separable_conv_model.weights')
const weights = new Float32Array(await res.arrayBuffer())
const net = faceapi.createTinyYolov2(weights)
net.dispose()
})
})
})
describe('NeuralNetwork, quantized model', () => {
it('disposes all param tensors', async () => {
await expectAllTensorsReleased(async () => {
const net = new faceapi.TinyYolov2()
await net.load('base/weights')
net.dispose()
})
})
})
})
})
\ No newline at end of file
......@@ -4,9 +4,8 @@ import { IRect } from '../build/Rect';
import * as faceapi from '../src/';
import { NeuralNetwork } from '../src/commons/NeuralNetwork';
import { IPoint } from '../src/';
import { allFacesFactory, allFacesMtcnnFactory } from '../src/allFacesFactory';
import { allFacesMtcnnFunction, allFacesFunction, tinyYolov2 } from '../src/globalApi';
import { TinyYolov2 } from '../src/tinyYolov2/TinyYolov2';
import { allFacesMtcnnFactory, allFacesSsdMobilenetv1Factory, allFacesTinyYolov2Factory } from '../src/allFacesFactory';
import { allFacesMtcnnFunction, allFacesSsdMobilenetv1Function, allFacesTinyYolov2, allFacesTinyYolov2Function } from '../src/globalApi';
export function zeros(length: number): Float32Array {
return new Float32Array(length)
......@@ -55,8 +54,13 @@ export type WithNetOptions = {
quantized?: boolean
}
export type WithTinyYolov2Options = WithNetOptions & {
withSeparableConv?: boolean
}
export type InjectNetArgs = {
allFaces: allFacesFunction
allFacesSsdMobilenetv1: allFacesSsdMobilenetv1Function
allFacesTinyYolov2: allFacesTinyYolov2Function
allFacesMtcnn: allFacesMtcnnFunction
faceDetectionNet: faceapi.FaceDetectionNet
faceLandmarkNet: faceapi.FaceLandmarkNet
......@@ -67,13 +71,14 @@ export type InjectNetArgs = {
export type DescribeWithNetsOptions = {
withAllFaces?: boolean
withAllFacesSsdMobilenetv1?: boolean
withAllFacesTinyYolov2?: boolean
withAllFacesMtcnn?: boolean
withFaceDetectionNet?: WithNetOptions
withFaceLandmarkNet?: WithNetOptions
withFaceRecognitionNet?: WithNetOptions
withMtcnn?: WithNetOptions
withTinyYolov2?: WithNetOptions
withTinyYolov2?: WithTinyYolov2Options
}
async function loadNetWeights(uri: string): Promise<Float32Array> {
......@@ -82,12 +87,13 @@ async function loadNetWeights(uri: string): Promise<Float32Array> {
async function initNet<TNet extends NeuralNetwork<any>>(
net: TNet,
uncompressedFilename: string | boolean
uncompressedFilename: string | boolean,
isUnusedModel: boolean = false
) {
await net.load(
uncompressedFilename
? await loadNetWeights(`base/weights_uncompressed/${uncompressedFilename}`)
: 'base/weights'
: (isUnusedModel ? 'base/weights_unused' : 'base/weights')
)
}
......@@ -102,13 +108,15 @@ export function describeWithNets(
let faceLandmarkNet: faceapi.FaceLandmarkNet = new faceapi.FaceLandmarkNet()
let faceRecognitionNet: faceapi.FaceRecognitionNet = new faceapi.FaceRecognitionNet()
let mtcnn: faceapi.Mtcnn = new faceapi.Mtcnn()
let tinyYolov2: faceapi.TinyYolov2 = new faceapi.TinyYolov2()
let allFaces = allFacesFactory(faceDetectionNet, faceLandmarkNet, faceRecognitionNet)
let tinyYolov2: faceapi.TinyYolov2 = new faceapi.TinyYolov2(options.withTinyYolov2 && options.withTinyYolov2.withSeparableConv)
let allFacesSsdMobilenetv1 = allFacesSsdMobilenetv1Factory(faceDetectionNet, faceLandmarkNet, faceRecognitionNet)
let allFacesTinyYolov2 = allFacesTinyYolov2Factory(tinyYolov2, faceLandmarkNet, faceRecognitionNet)
let allFacesMtcnn = allFacesMtcnnFactory(mtcnn, faceRecognitionNet)
beforeAll(async () => {
const {
withAllFaces,
withAllFacesSsdMobilenetv1,
withAllFacesTinyYolov2,
withAllFacesMtcnn,
withFaceDetectionNet,
withFaceLandmarkNet,
......@@ -117,21 +125,21 @@ export function describeWithNets(
withTinyYolov2
} = options
if (withFaceDetectionNet || withAllFaces) {
if (withFaceDetectionNet || withAllFacesSsdMobilenetv1) {
await initNet<faceapi.FaceDetectionNet>(
faceDetectionNet,
!!withFaceDetectionNet && !withFaceDetectionNet.quantized && 'ssd_mobilenetv1_model.weights'
)
}
if (withFaceLandmarkNet || withAllFaces) {
if (withFaceLandmarkNet || withAllFacesSsdMobilenetv1 || withAllFacesTinyYolov2) {
await initNet<faceapi.FaceLandmarkNet>(
faceLandmarkNet,
!!withFaceLandmarkNet && !withFaceLandmarkNet.quantized && 'face_landmark_68_model.weights'
)
}
if (withFaceRecognitionNet || withAllFaces || withAllFacesMtcnn) {
if (withFaceRecognitionNet || withAllFacesSsdMobilenetv1 || withAllFacesMtcnn || withAllFacesTinyYolov2) {
await initNet<faceapi.FaceRecognitionNet>(
faceRecognitionNet,
// TODO: figure out why quantized weights results in NaNs in testcases
......@@ -146,10 +154,11 @@ export function describeWithNets(
)
}
if (withTinyYolov2) {
if (withTinyYolov2 || withAllFacesTinyYolov2) {
await initNet<faceapi.TinyYolov2>(
tinyYolov2,
!!withTinyYolov2 && !withTinyYolov2.quantized && 'tiny_yolov2_model.weights'
!!withTinyYolov2 && !withTinyYolov2.quantized && 'tiny_yolov2_model.weights',
withTinyYolov2 && withTinyYolov2.withSeparableConv === false
)
}
})
......@@ -162,7 +171,16 @@ export function describeWithNets(
tinyYolov2 && tinyYolov2.dispose()
})
specDefinitions({ allFaces, allFacesMtcnn, faceDetectionNet, faceLandmarkNet, faceRecognitionNet, mtcnn, tinyYolov2 })
specDefinitions({
allFacesSsdMobilenetv1,
allFacesTinyYolov2,
allFacesMtcnn,
faceDetectionNet,
faceLandmarkNet,
faceRecognitionNet,
mtcnn,
tinyYolov2
})
})
}
......@@ -9,9 +9,10 @@ app.use(express.static(path.join(__dirname, './public')))
app.use(express.static(path.join(__dirname, './node_modules/file-saver')))
app.use(express.static(path.join(__dirname, '../../examples/public')))
app.use(express.static(path.join(__dirname, '../../weights')))
app.use(express.static(path.join(__dirname, '../../weights_uncompressed')))
app.use(express.static(path.join(__dirname, '../../dist')))
app.get('/', (req, res) => res.redirect('/quantize_tiny_yolov2'))
app.get('/quantize_tiny_yolov2', (req, res) => res.sendFile(path.join(viewsDir, 'quantizeTinyYolov2.html')))
app.get('/', (req, res) => res.redirect('/quantize_model'))
app.get('/quantize_model', (req, res) => res.sendFile(path.join(viewsDir, 'quantizeModel.html')))
app.listen(3000, () => console.log('Listening on port 3000!'))
\ No newline at end of file
......@@ -10,6 +10,10 @@
<script>
tf = faceapi.tf
const modelName = 'tiny_yolov2_separable_conv'
const uncompressedWeightsUri = `tiny_yolov2_separable_conv_model_v1.weights`
const net = new faceapi.TinyYolov2(true)
async function loadNetWeights(uri) {
return new Float32Array(await (await fetch(uri)).arrayBuffer())
}
......@@ -36,15 +40,13 @@
}
async function quantizeAndSave() {
const modelName = 'tiny_yolov2'
const tinyYolov2 = new faceapi.TinyYolov2()
await tinyYolov2.load(await loadNetWeights(`uncompressed/tiny_yolov2_model.weights`))
await net.load(await loadNetWeights(uncompressedWeightsUri))
const quantizedTensorArrays = []
const weightEntries = []
tinyYolov2.getParamList().forEach(({ path, tensor }) => {
net.getParamList().forEach(({ path, tensor }) => {
const { scale, min, qdata } = quantizeWeights(tensor)
const weightEntry = {
......
async function promiseSequential(promises) {
const curr = promises[0]
if (!curr) {
return
}
await curr()
return promiseSequential(promises.slice(1))
}
async function trainStep(batchCreators) {
await promiseSequential(batchCreators.map((batchCreator, dataIdx) => async () => {
const { batchInput, landmarksBatchTensor } = await batchCreator()
let ts = Date.now()
const cost = optimizer.minimize(() => {
const out = window.trainNet.forwardInput(batchInput.managed())
......@@ -19,7 +10,7 @@ async function trainStep(batchCreators) {
landmarksBatchTensor,
out
)
return loss
return tf.sum(out)
}, true)
ts = Date.now() - ts
......@@ -29,6 +20,7 @@ async function trainStep(batchCreators) {
cost.dispose()
await tf.nextFrame()
console.log(tf.memory())
}))
}
......@@ -73,7 +65,7 @@ function landmarkPositionsToArray(landmarks) {
}
function toFaceLandmarks(landmarks, { naturalWidth, naturalHeight }) {
return new faceapi.FaceLandmarks(
return new faceapi.FaceLandmarks68(
landmarks.map(l => new faceapi.Point(l.x / naturalWidth, l.y / naturalHeight)),
{ width: naturalWidth, height: naturalHeight }
)
......@@ -100,20 +92,11 @@ async function getTrainData() {
(_, i) => landmarksJson[i]
)
return await loadImagesInBatch(allLandmarks.slice(0, 100))
/**
const batch1 = await loadImagesInBatch(allLandmarks.slice(0, 4000))
const batch2 = await loadImagesInBatch(allLandmarks.slice(4000), 4000)
return batch1.concat(batch2)
}
// https://stackoverflow.com/questions/6274339/how-can-i-shuffle-an-array
function shuffle(a) {
var j, x, i;
for (i = a.length - 1; i > 0; i--) {
j = Math.floor(Math.random() * (i + 1));
x = a[i];
a[i] = a[j];
a[j] = x;
}
return a;
*/
}
\ No newline at end of file
......@@ -60,7 +60,7 @@ async function init() {
//window.nets.push(await loadNet('retrained/landmarks_v0.weights'))
//window.nets.push(await loadNet('retrained/landmarks_v2.weights'))
window.trainNet = await loadNet('retrained/landmarks_v6.weights')
window.trainNet = await loadNet('/tmp/retrained/landmarks_v9.weights')
window.nets.push(trainNet)
$('#loader').hide()
......
......@@ -10,6 +10,7 @@
<script src="faceLandmarksUi.js"></script>
<script src="faceLandmarksTrain.js"></script>
<script src="FileSaver.js"></script>
<script src="trainUtils.js"></script>
</head>
<body>
<div id="navbar"></div>
......@@ -60,7 +61,7 @@
await train()
}
async function train(batchSize = 10) {
async function train(batchSize = 1) {
for (let i = 0; i < trainSteps; i++) {
console.log('step', i)
const batchCreators = createBatchCreators(shuffle(window.trainData), batchSize)
......@@ -69,20 +70,11 @@
ts = Date.now() - ts
console.log('step %s done (%s ms)', i, ts)
if (((i + 1) % saveEveryNthIteration) === 0) {
saveWeights(i)
//saveWeights(window.trainNet, 'landmark_trained_weights_' + idx + '.weights')
}
}
}
function saveWeights(idx = 0) {
const binaryWeights = new Float32Array(
window.trainNet.getParamList()
.map(({ tensor }) => Array.from(tensor.dataSync()))
.reduce((flat, arr) => flat.concat(arr))
)
saveAs(new Blob([binaryWeights]), 'landmark_trained_weights_' + idx + '.weights')
}
</script>
</body>
......
module.exports = function(config) {
config.set({
frameworks: ['jasmine', 'karma-typescript'],
files: [
'tinyYolov2/**/*.ts'
],
preprocessors: {
'**/*.ts': ['karma-typescript']
},
karmaTypescriptConfig: {
tsconfig: './tsconfig.test.json'
},
browsers: ['Chrome'],
browserNoActivityTimeout: 60000,
client: {
jasmine: {
timeoutInterval: 30000
}
}
})
}
{
"scripts": {
"start": "node server.js"
"start": "node server.js",
"test": "karma start"
},
"author": "justadudewhohacks",
"license": "MIT",
......
require('./.env')
require('./faceLandmarks/.env')
const express = require('express')
const path = require('path')
const app = express()
const viewsDir = path.join(__dirname, 'views')
app.use(express.static(viewsDir))
app.use(express.static(path.join(__dirname, './public')))
app.use(express.static(path.join(__dirname, './tmp')))
const publicDir = path.join(__dirname, './faceLandmarks')
app.use(express.static(publicDir))
app.use(express.static(path.join(__dirname, './shared')))
app.use(express.static(path.join(__dirname, './node_modules/file-saver')))
app.use(express.static(path.join(__dirname, '../../examples/public')))
app.use(express.static(path.join(__dirname, '../../weights')))
......@@ -18,6 +17,6 @@ const trainDataPath = path.resolve(process.env.TRAIN_DATA_PATH)
app.use(express.static(trainDataPath))
app.get('/', (req, res) => res.redirect('/face_landmarks'))
app.get('/face_landmarks', (req, res) => res.sendFile(path.join(viewsDir, 'faceLandmarks.html')))
app.get('/face_landmarks', (req, res) => res.sendFile(path.join(publicDir, 'train.html')))
app.listen(3000, () => console.log('Listening on port 3000!'))
\ No newline at end of file
require('./tinyYolov2/.env')
const express = require('express')
const path = require('path')
const fs = require('fs')
const app = express()
const publicDir = path.join(__dirname, './tinyYolov2')
app.use(express.static(publicDir))
app.use(express.static(path.join(__dirname, './shared')))
app.use(express.static(path.join(__dirname, './node_modules/file-saver')))
app.use(express.static(path.join(__dirname, '../../examples/public')))
app.use(express.static(path.join(__dirname, '../../weights')))
app.use(express.static(path.join(__dirname, '../../dist')))
const trainDataPath = path.resolve(process.env.TRAIN_DATA_PATH)
const testDataPath = path.resolve(process.env.TEST_DATA_PATH)
const imagesPath = path.join(trainDataPath, './final_images')
const detectionsPath = path.join(trainDataPath, './final_detections')
app.use(express.static(imagesPath))
app.use(express.static(detectionsPath))
app.use(express.static(testDataPath))
const detectionFilenames = fs.readdirSync(detectionsPath)
const detectionFilenamesMultibox = JSON.parse(fs.readFileSync(path.join(__dirname, './tinyYolov2/multibox.json')))
app.use(express.static(trainDataPath))
app.get('/detection_filenames', (req, res) => res.status(202).send(detectionFilenames))
app.get('/detection_filenames_multibox', (req, res) => res.status(202).send(detectionFilenamesMultibox))
app.get('/', (req, res) => res.sendFile(path.join(publicDir, 'train.html')))
app.get('/verify', (req, res) => res.sendFile(path.join(publicDir, 'verify.html')))
app.get('/test', (req, res) => res.sendFile(path.join(publicDir, 'test.html')))
app.listen(3000, () => console.log('Listening on port 3000!'))
\ No newline at end of file
const log = (str, ...args) => console.log(`[${[(new Date()).toTimeString().substr(0, 8)]}] ${str || ''}`, ...args)
async function promiseSequential(promises) {
const curr = promises[0]
if (!curr) {
return
}
await curr()
return promiseSequential(promises.slice(1))
}
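promiseSequential runs an array of promise factories strictly one after another; the training loops below rely on it to process batches sequentially rather than concurrently. A minimal usage sketch with made-up tasks:

```js
// Minimal usage sketch (made-up tasks): each factory is awaited before
// the next one starts, unlike Promise.all which would run them concurrently.
const tasks = [1, 2, 3].map(n => async () => console.log('processing batch', n))
promiseSequential(tasks).then(() => console.log('all batches done'))
```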
// https://stackoverflow.com/questions/6274339/how-can-i-shuffle-an-array
function shuffle(a) {
var j, x, i;
for (i = a.length - 1; i > 0; i--) {
j = Math.floor(Math.random() * (i + 1));
x = a[i];
a[i] = a[j];
a[j] = x;
}
return a;
}
function saveWeights(net, filename = 'train_tmp') {
const binaryWeights = new Float32Array(
net.getParamList()
.map(({ tensor }) => Array.from(tensor.dataSync()))
.reduce((flat, arr) => flat.concat(arr))
)
saveAs(new Blob([binaryWeights]), filename)
}
function imageToSquare(img) {
const scale = 608 / Math.max(img.height, img.width)
const width = scale * img.width
const height = scale * img.height
const canvas1 = faceapi.createCanvasFromMedia(img)
const targetCanvas = faceapi.createCanvas({ width: 608, height: 608 })
targetCanvas.getContext('2d').putImageData(canvas1.getContext('2d').getImageData(0, 0, width, height), 0, 0)
return targetCanvas
}
function getPaddingsAndReshapedSize(img, inputSize) {
const [h, w] = [img.height, img.width]
const maxDim = Math.max(h, w)
const f = inputSize / maxDim
const reshapedImgDims = {
height: Math.floor(h * f),
width: Math.floor(w * f)
}
const paddings = new faceapi.Point(
maxDim / img.width,
maxDim / img.height
)
return { paddings, reshapedImgDims }
}
\ No newline at end of file
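For intuition about the values getPaddingsAndReshapedSize returns, a small worked example (the image dimensions are illustrative only):

```js
// Illustrative numbers: an 800x600 image scaled to an inputSize of 416.
// f = 416 / 800 = 0.52
// reshapedImgDims -> { height: 312, width: 416 }
// paddings        -> Point(800 / 800, 800 / 600) = (1, ~1.33)
const { paddings, reshapedImgDims } = getPaddingsAndReshapedSize({ width: 800, height: 600 }, 416)
```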
<!DOCTYPE html>
<html>
<head>
<script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs@0.12.0"> </script>
<script src="FileSaver.js"></script>
</head>
<body>
<script>
const glorotNormal = tf.initializers.glorotNormal()
saveWeights()
function initSeparableConvWeights(inChannels, outChannels) {
return {
depthwiseFilter: glorotNormal.apply([3, 3, inChannels, 1]),
pointwiseFilter: glorotNormal.apply([1, 1, inChannels, outChannels]),
bias: tf.zeros([outChannels])
}
}
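The shapes above show why the separable variant is so much smaller: a 3x3 depthwise filter plus a 1x1 pointwise filter replaces a full 3x3 convolution. A rough per-layer parameter count for the largest layer (illustrative arithmetic, biases included):

```js
// Rough parameter counts for a single 3x3 layer (illustrative only):
const regularConvParams = (inCh, outCh) => 3 * 3 * inCh * outCh + outCh
const separableConvParams = (inCh, outCh) => 3 * 3 * inCh + inCh * outCh + outCh
console.log(regularConvParams(512, 1024))   // 4719616
console.log(separableConvParams(512, 1024)) // 529920, roughly 9x fewer weights
```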
function initTinyYolov2SeparableWeights() {
const conv0 = initSeparableConvWeights(3, 16)
const conv1 = initSeparableConvWeights(16, 32)
const conv2 = initSeparableConvWeights(32, 64)
const conv3 = initSeparableConvWeights(64, 128)
const conv4 = initSeparableConvWeights(128, 256)
const conv5 = initSeparableConvWeights(256, 512)
const conv6 = initSeparableConvWeights(512, 1024)
const conv7 = initSeparableConvWeights(1024, 1024)
const conv8 = {
filters: glorotNormal.apply([1, 1, 1024, 25]),
bias: tf.zeros([25])
}
return {
conv0,
conv1,
conv2,
conv3,
conv4,
conv5,
conv6,
conv7,
conv8
}
}
function saveWeights() {
const w = initTinyYolov2SeparableWeights()
const binaryWeights = new Float32Array(
Array(8).fill(0)
.map((_, i) => w[`conv${i}`])
.map(ps => [ps.depthwiseFilter, ps.pointwiseFilter, ps.bias])
.reduce((flat, arr) => flat.concat(arr))
.concat([w.conv8.filters, w.conv8.bias])
.map(tensor => console.log(tensor) || Array.from(tensor.dataSync()))
.reduce((flat, arr) => flat.concat(arr))
)
saveAs(new Blob([binaryWeights]), 'foo.weights')
}
</script>
</body>
</html>
\ No newline at end of file
const CELL_SIZE = 32
const getNumCells = inputSize => inputSize / CELL_SIZE
const inverseSigmoid = x => Math.log(x / (1 - x))
function getAnchors() {
return window.net.anchors
}
function squaredSumOverMask(lossTensors, mask) {
return tf.tidy(() => tf.sum(tf.square(tf.mul(mask, lossTensors))))
}
// assigns each ground truth box to the grid cell containing its center
// and to the anchor whose shape has the highest IoU with the box
function assignBoxesToAnchors(groundTruthBoxes, reshapedImgDims) {
const inputSize = Math.max(reshapedImgDims.width, reshapedImgDims.height)
const numCells = getNumCells(inputSize)
return groundTruthBoxes.map(box => {
const { left, top, width, height } = box.rescale(reshapedImgDims)
const ctX = left + (width / 2)
const ctY = top + (height / 2)
const col = Math.floor((ctX / inputSize) * numCells)
const row = Math.floor((ctY / inputSize) * numCells)
const anchorsByIou = getAnchors().map((anchor, idx) => ({
idx,
iou: faceapi.iou(
new faceapi.BoundingBox(0, 0, anchor.x * CELL_SIZE, anchor.y * CELL_SIZE),
new faceapi.BoundingBox(0, 0, width, height)
)
})).sort((a1, a2) => a2.iou - a1.iou)
const anchor = anchorsByIou[0].idx
return { row, col, anchor, box }
})
}
// masks have 25 channels per cell: 5 anchors x (4 box offsets + 1 score)
function getGroundTruthMask(groundTruthBoxes, inputSize) {
const numCells = getNumCells(inputSize)
const mask = tf.zeros([numCells, numCells, 25])
const buf = mask.buffer()
groundTruthBoxes.forEach(({ row, col, anchor }) => {
const anchorOffset = anchor * 5
for (let i = 0; i < 5; i++) {
buf.set(1, row, col, anchorOffset + i)
}
})
return mask
}
function getCoordAndScoreMasks(inputSize) {
const numCells = getNumCells(inputSize)
const coordMask = tf.zeros([numCells, numCells, 25])
const scoreMask = tf.zeros([numCells, numCells, 25])
const coordBuf = coordMask.buffer()
const scoreBuf = scoreMask.buffer()
for (let row = 0; row < numCells; row++) {
for (let col = 0; col < numCells; col++) {
for (let anchor = 0; anchor < 5; anchor++) {
const anchorOffset = 5 * anchor
for (let i = 0; i < 4; i++) {
coordBuf.set(1, row, col, anchorOffset + i)
}
scoreBuf.set(1, row, col, anchorOffset + 4)
}
}
}
return { coordMask, scoreMask }
}
function computeBoxAdjustments(groundTruthBoxes, reshapedImgDims) {
const inputSize = Math.max(reshapedImgDims.width, reshapedImgDims.height)
const numCells = getNumCells(inputSize)
const adjustments = tf.zeros([numCells, numCells, 25])
const buf = adjustments.buffer()
groundTruthBoxes.forEach(({ row, col, anchor, box }) => {
const { left, top, right, bottom, width, height } = box.rescale(reshapedImgDims)
const centerX = (left + right) / 2
const centerY = (top + bottom) / 2
//const dCenterX = centerX - (col * CELL_SIZE + (CELL_SIZE / 2))
//const dCenterY = centerY - (row * CELL_SIZE + (CELL_SIZE / 2))
const dCenterX = centerX - (col * CELL_SIZE)
const dCenterY = centerY - (row * CELL_SIZE)
const dx = inverseSigmoid(dCenterX / CELL_SIZE)
const dy = inverseSigmoid(dCenterY / CELL_SIZE)
//const dx = dCenterX / CELL_SIZE
//const dy = dCenterY / CELL_SIZE
const dw = Math.log((width / CELL_SIZE) / getAnchors()[anchor].x)
const dh = Math.log((height / CELL_SIZE) / getAnchors()[anchor].y)
const anchorOffset = anchor * 5
buf.set(dx, row, col, anchorOffset + 0)
buf.set(dy, row, col, anchorOffset + 1)
buf.set(dw, row, col, anchorOffset + 2)
buf.set(dh, row, col, anchorOffset + 3)
})
return adjustments
}
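computeBoxAdjustments encodes each ground truth box relative to its grid cell and anchor using the usual YOLO parameterization. For reference, a minimal sketch of the inverse mapping, mirroring the formulas above (this is not the library's actual postProcess implementation):

```js
// Minimal sketch of inverting the (dx, dy, dw, dh) encoding above,
// in pixel coordinates of the reshaped image (illustrative only).
function decodeBoxAdjustment({ row, col, anchor }, [dx, dy, dw, dh]) {
  const sigmoid = x => 1 / (1 + Math.exp(-x))
  const centerX = (col + sigmoid(dx)) * CELL_SIZE
  const centerY = (row + sigmoid(dy)) * CELL_SIZE
  const width = Math.exp(dw) * getAnchors()[anchor].x * CELL_SIZE
  const height = Math.exp(dh) * getAnchors()[anchor].y * CELL_SIZE
  return { x: centerX - width / 2, y: centerY - height / 2, width, height }
}
```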
function computeIous(predBoxes, groundTruthBoxes, reshapedImgDims) {
const inputSize = Math.max(reshapedImgDims.width, reshapedImgDims.height)
const numCells = getNumCells(inputSize)
const isSameAnchor = p1 => p2 =>
p1.row === p2.row
&& p1.col === p2.col
&& p1.anchor === p2.anchor
const ious = tf.zeros([numCells, numCells, 25])
const buf = ious.buffer()
groundTruthBoxes.forEach(({ row, col, anchor, box }) => {
const predBox = predBoxes.find(isSameAnchor({ row, col, anchor }))
if (!predBox) {
console.log(groundTruthBoxes)
console.log(predBoxes)
throw new Error(`no output box found for: row ${row}, col ${col}, anchor ${anchor}`)
}
const iou = faceapi.iou(
box.rescale(reshapedImgDims),
predBox.box.rescale(reshapedImgDims)
)
if (window.debug) {
console.log('ground truth box:', box.rescale(reshapedImgDims).toRect())
console.log('predicted box:', predBox.box.rescale(reshapedImgDims).toRect())
console.log('predicted score:', predBox.score)
console.log('iou:', iou)
}
const anchorOffset = anchor * 5
buf.set(iou, row, col, anchorOffset + 4)
})
return ious
}
window.computeNoObjectLoss = function(outTensor, mask) {
return tf.tidy(() => {
const lossTensor = tf.sigmoid(outTensor)
return squaredSumOverMask(lossTensor, mask)
})
}
function computeObjectLoss(outTensor, groundTruthBoxes, reshapedImgDims, paddings, mask) {
return tf.tidy(() => {
const predBoxes = window.net.postProcess(
outTensor,
{ paddings }
)
const ious = computeIous(
predBoxes,
groundTruthBoxes,
reshapedImgDims
)
const lossTensor = tf.sub(ious, tf.sigmoid(outTensor))
return squaredSumOverMask(lossTensor, mask)
})
}
function computeCoordLoss(groundTruthBoxes, outTensor, reshapedImgDims, mask, paddings) {
return tf.tidy(() => {
const boxAdjustments = computeBoxAdjustments(
groundTruthBoxes,
reshapedImgDims
)
if (window.debug) {
const indToPos = []
const numCells = outTensor.shape[1]
for (let row = 0; row < numCells; row++) {
for (let col = 0; col < numCells; col++) {
for (let anchor = 0; anchor < 25; anchor++) {
indToPos.push({ row, col, anchor: Math.floor(anchor / 5) })
}
}
}
const indices = Array.from(mask.dataSync()).map((val, ind) => ({ val, ind })).filter(v => v.val !== 0).map(v => v.ind)
const gt = Array.from(boxAdjustments.dataSync())
const out = Array.from(outTensor.dataSync())
const comp = indices.map(i => (
{
pos: indToPos[i],
gt: gt[i],
out: out[i]
}
))
console.log(comp.map(c => `gt: ${c.gt}, out: ${c.out}`))
const getBbox = (which) => {
const { row, col, anchor } = comp[0].pos
const ctX = ((col + faceapi.sigmoid(comp[0][which])) / numCells) * paddings.x
const ctY = ((row + faceapi.sigmoid(comp[1][which])) / numCells) * paddings.y
const width = ((Math.exp(comp[2][which]) * getAnchors()[anchor].x) / numCells) * paddings.x
const height = ((Math.exp(comp[3][which]) * getAnchors()[anchor].y) / numCells) * paddings.y
const x = (ctX - (width / 2))
const y = (ctY - (height / 2))
return new faceapi.BoundingBox(x, y, x + width, y + height)
}
const outRect = getBbox('out').rescale(reshapedImgDims).toRect()
const gtRect = getBbox('gt').rescale(reshapedImgDims).toRect()
console.log('out', outRect)
console.log('gtRect', gtRect)
}
const lossTensor = tf.sub(boxAdjustments, outTensor)
return squaredSumOverMask(lossTensor, mask)
})
}
function computeLoss(outTensor, groundTruth, reshapedImgDims, paddings) {
const inputSize = Math.max(reshapedImgDims.width, reshapedImgDims.height)
if (!inputSize) {
throw new Error(`invalid inputSize: ${inputSize}`)
}
let groundTruthBoxes = assignBoxesToAnchors(
groundTruth
.map(({ x, y, width, height }) => new faceapi.Rect(x, y, width, height))
.map(rect => rect.toBoundingBox()),
reshapedImgDims
)
const groundTruthMask = getGroundTruthMask(groundTruthBoxes, inputSize)
const { coordMask, scoreMask } = getCoordAndScoreMasks(inputSize)
const noObjectLossMask = tf.tidy(() => tf.mul(scoreMask, tf.sub(tf.scalar(1), groundTruthMask)))
const objectLossMask = tf.tidy(() => tf.mul(scoreMask, groundTruthMask))
const coordLossMask = tf.tidy(() => tf.mul(coordMask, groundTruthMask))
const noObjectLoss = tf.tidy(() =>
tf.mul(
tf.scalar(noObjectScale),
computeNoObjectLoss(outTensor, noObjectLossMask)
)
)
const objectLoss = tf.tidy(() =>
tf.mul(
tf.scalar(objectScale),
computeObjectLoss(outTensor, groundTruthBoxes, reshapedImgDims, paddings, objectLossMask)
)
)
const coordLoss = tf.tidy(() =>
tf.mul(
tf.scalar(coordScale),
computeCoordLoss(groundTruthBoxes, outTensor, reshapedImgDims, coordLossMask, paddings)
)
)
const totalLoss = tf.tidy(() => noObjectLoss.add(objectLoss).add(coordLoss))
return {
noObjectLoss,
objectLoss,
coordLoss,
totalLoss
}
}
\ No newline at end of file
import * as _tf from '@tensorflow/tfjs-core';
const faceapi = require('../../../dist/face-api.js')
const tf: typeof _tf = faceapi.tf
require('./loss')
window['faceapi'] = faceapi
window['tf'] = tf
const anchors = [
new faceapi.Point(1.603231, 2.094468),
new faceapi.Point(6.041143, 7.080126),
new faceapi.Point(2.882459, 3.518061),
new faceapi.Point(4.266906, 5.178857),
new faceapi.Point(9.041765, 10.66308)
]
window['net'] = {
getAnchors() {
return anchors
}
}
describe('loss', () => {
describe('computeNoObjectLoss', () => {
const computeNoObjectLoss = window['computeNoObjectLoss']
it('should only compute loss over scores, 1x1 grid', () => tf.tidy(() => {
const outTensor = tf.zeros([1, 1, 1, 25])
const loss = tf.sum(computeNoObjectLoss(outTensor)).dataSync()[0]
expect(loss).toEqual(0.5 * 0.5 * 5)
}))
it('should only compute loss over scores, 13x13 grid', () => tf.tidy(() => {
const outTensor = tf.zeros([1, 13, 13, 25])
const loss = tf.sum(computeNoObjectLoss(outTensor)).dataSync()[0]
expect(loss).toEqual(0.5 * 0.5 * 5 * 13 * 13)
}))
it('should only compute loss over scores, 13x13 grid, batchSize: 10', () => tf.tidy(() => {
const outTensor = tf.zeros([10, 13, 13, 25])
const loss = tf.sum(computeNoObjectLoss(outTensor)).dataSync()[0]
expect(loss).toEqual(0.5 * 0.5 * 5 * 13 * 13 * 10)
}))
})
})
\ No newline at end of file
<!DOCTYPE html>
<html>
<head>
<script src="face-api.js"></script>
<script src="commons.js"></script>
<script src="trainUtils.js"></script>
<script src="loss.js"></script>
<script src="FileSaver.js"></script>
<link rel="stylesheet" href="styles.css">
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/materialize/0.100.2/css/materialize.css">
<script type="text/javascript" src="https://code.jquery.com/jquery-2.1.1.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/materialize/0.100.2/js/materialize.min.js"></script>
</head>
<body>
<script>
window.tf = faceapi.tf
// hyper parameters
window.objectScale = 5
window.noObjectScale = 1
window.coordScale = 1
const weightsUrl = `/tmp/tmp__224_35060__320_41188__416_31050__608_16520.weights`
//const inputSizes = [160, 224, 320, 416]
const inputSizes = [512, 608]
async function loadNetWeights(uri) {
return new Float32Array(await (await fetch(uri)).arrayBuffer())
}
async function fetchFddbJson() {
return fetch('/fddb-detections.json').then(res => res.json())
}
async function run() {
window.fddbJson = await fetchFddbJson()
const weights = await loadNetWeights(weightsUrl)
window.net = new faceapi.TinyYolov2(true)
await window.net.load(weights)
await promiseSequential(inputSizes.map(inputSize => async () => {
await promiseSequential(window.fddbJson.map(({ filePath, rects }) => () => {
return test(filePath, rects, inputSize)
}))
const losses = Object.keys(window.lossMap[inputSize]).map(k => window.lossMap[inputSize][k])
const totalLoss = losses
.map(l => l.totalLoss)
.reduce((sum, l) => sum + l)
const avgLoss = totalLoss / losses.length
log(`totalLoss (${inputSize}): ${totalLoss}`)
log(`avgLoss (${inputSize}): ${avgLoss}`)
window.losses = window.losses || {}
window.losses[inputSize] = { totalLoss, avgLoss }
}))
console.log(window.losses)
}
async function test(fileUri, rects, inputSize) {
const img = await faceapi.bufferToImage(await fetchImage(fileUri))
const groundTruthBoxes = rects
.map(({ x, y, width, height }) => new faceapi.Rect(x, y, width, height))
.map(rect => rect.clipAtImageBorders(img.width, img.height))
.map(({ x, y, width, height }) => ({
x: x / img.width,
y: y / img.height,
width: width / img.width,
height: height / img.height,
}))
const { reshapedImgDims, paddings } = getPaddingsAndReshapedSize(img, inputSize)
const squareImg = imageToSquare(img)
const netInput = (await faceapi.toNetInput(squareImg)).managed()
const losses = tf.tidy(() => {
const outTensor = window.net.forwardInput(netInput, inputSize)
const {
noObjectLoss,
objectLoss,
coordLoss,
totalLoss
} = computeLoss(
outTensor,
groundTruthBoxes,
reshapedImgDims,
paddings
)
const losses = {
totalLoss: totalLoss.dataSync()[0],
noObjectLoss: noObjectLoss.dataSync()[0],
objectLoss: objectLoss.dataSync()[0],
coordLoss: coordLoss.dataSync()[0]
}
return losses
})
log(`${fileUri}:`)
log(`ground truth boxes: ${groundTruthBoxes.length}`)
log(`noObjectLoss: ${losses.noObjectLoss}`)
log(`objectLoss: ${losses.objectLoss}`)
log(`coordLoss: ${losses.coordLoss}`)
log(`totalLoss: ${losses.totalLoss}`)
if (Object.keys(losses).map(k => losses[k]).some(loss => isNaN(loss) || loss === Infinity)) {
console.log(groundTruthBoxes)
console.log(img)
console.log(losses)
throw new Error('corrupted loss value')
}
window.lossMap = window.lossMap || {}
window.lossMap[inputSize] = window.lossMap[inputSize] || {}
window.lossMap[inputSize][fileUri] = losses
}
$(document).ready(function() {
run()
})
</script>
</body>
</html>
\ No newline at end of file
<!DOCTYPE html>
<html>
<head>
<link rel="stylesheet" href="styles.css">
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/materialize/0.100.2/css/materialize.css">
<script type="text/javascript" src="https://code.jquery.com/jquery-2.1.1.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/materialize/0.100.2/js/materialize.min.js"></script>
<script src="face-api.js"></script>
<script src="commons.js"></script>
<script src="FileSaver.js"></script>
<script src="trainUtils.js"></script>
<script src="train.js"></script>
<script src="loss.js"></script>
</head>
<body>
<script>
tf = faceapi.tf
const startIdx160 = 26600
const startIdx224 = 61660
const startIdx320 = 67788
const startIdx416 = 57650
const startIdx608 = 16520
//const weightsUrl = `/tmp/tmp__160_${startIdx160}__224_${startIdx224}__320_${startIdx320}__416_${startIdx416}__608_${startIdx608}.weights`
const weightsUrl = `/tmp/tmp_multiscale_count_8700.weights`
const fromEpoch = 0
const trainOnlyMultibox = false
const trainSizes = [416, 512, 608]
//const trainSizes = [608]
window.debug = false
window.logTrainSteps = true
window.count = 0
// hyper parameters
window.objectScale = 5
window.noObjectScale = 1
window.coordScale = 1
const rescaleEveryNthBatch = 100
window.saveEveryNthDataIdx = trainSizes.length * rescaleEveryNthBatch
window.trainSteps = 4000
//window.optimizer = tf.train.sgd(0.001)
window.optimizer = tf.train.adam(0.001, 0.9, 0.999, 1e-8)
// all samples
//const dataStartIdx = 8000
const dataStartIdx = 0
const numTrainSamples = Infinity
async function loadNetWeights(uri) {
return new Float32Array(await (await fetch(uri)).arrayBuffer())
}
async function fetchDetectionFilenames() {
return fetch('/detection_filenames').then(res => res.json())
}
async function fetchDetectionFilenamesMultibox() {
return fetch('/detection_filenames_multibox').then(res => res.json())
}
async function run() {
const weights = await loadNetWeights(weightsUrl)
window.net = new faceapi.TinyYolov2(true)
window.net.load(weights)
window.net.variable()
const fetchDetectionsFn = trainOnlyMultibox
? fetchDetectionFilenamesMultibox
: fetchDetectionFilenames
window.detectionFilenames = (await fetchDetectionsFn()).slice(dataStartIdx, dataStartIdx + numTrainSamples)
window.lossMap = {}
console.log('ready')
}
function logLossChange(lossType) {
const { currentLoss, prevLoss, detectionFilenames } = window
log(`${lossType} : ${faceapi.round(currentLoss[lossType])} (avg: ${faceapi.round(currentLoss[lossType] / detectionFilenames.length)}) (delta: ${currentLoss[lossType] - prevLoss[lossType]})`)
}
function onBatchProcessed(dataIdx, inputSize) {
window.count++
const idx = (dataIdx + 1) + (window.epoch * window.detectionFilenames.length)
console.log('dataIdx', dataIdx)
if ((window.count % saveEveryNthDataIdx) === 0) {
saveWeights(window.net, `tmp_multiscale_count_${window.count}.weights`)
}
}
function _onBatchProcessed(dataIdx, inputSize) {
const idx = (dataIdx + 1) + (window.epoch * window.detectionFilenames.length)
console.log('idx', idx)
if ((idx % saveEveryNthDataIdx) === 0) {
saveWeights(window.net, `tmp__224_${startIdx224 + (inputSize === 224 ? idx : 0)}__320_${startIdx320 + (inputSize === 320 ? idx : 0)}__416_${startIdx416 + (inputSize === 416 ? idx : 0)}__608_${startIdx608 + (inputSize === 608 ? idx : 0)}.weights`)
}
}
async function train() {
const batchSize = 1
for (let i = fromEpoch; i < trainSteps; i++) {
window.epoch = i
log('step', i)
let ts2 = Date.now()
const batchCreators = createBatchCreators(shuffle(window.detectionFilenames), batchSize)
await trainStep(batchCreators, trainSizes, rescaleEveryNthBatch, onBatchProcessed)
ts2 = Date.now() - ts2
}
ts = Date.now() - ts
log()
log('--------------------')
log()
log('step %s done (%s ms)', i, ts)
window.prevLoss = window.currentLoss
window.currentLoss = Object.keys(lossMap)
.map(filename => lossMap[filename])
.reduce((accumulatedLosses, losses) =>
Object.keys(losses)
.map(key => ({
[key]: (accumulatedLosses[key] || 0) + losses[key]
}))
.reduce((map, curr) => ({ ...map, ...curr }), {}),
{}
)
if (window.prevLoss) {
logLossChange('noObjectLoss')
logLossChange('objectLoss')
logLossChange('coordLoss')
logLossChange('totalLoss')
}
log()
log('--------------------')
log()
}
run()
</script>
</body>
</html>
\ No newline at end of file
const batchIdx = 0
function minimize(groundTruthBoxes, batchInput, inputSize, batch, { reshapedImgDims, paddings }) {
const filename = batch.filenames[batchIdx]
const { dataIdx } = batch
return optimizer.minimize(() => {
const outTensor = window.net.forwardInput(batchInput, inputSize)
const {
noObjectLoss,
objectLoss,
coordLoss,
totalLoss
} = computeLoss(
outTensor,
groundTruthBoxes,
reshapedImgDims,
paddings
)
const losses = {
totalLoss: totalLoss.dataSync()[0],
noObjectLoss: noObjectLoss.dataSync()[0],
objectLoss: objectLoss.dataSync()[0],
coordLoss: coordLoss.dataSync()[0]
}
const lossKey = `${filename}_${inputSize}`
if (window.logTrainSteps) {
log(`ground truth boxes: ${groundTruthBoxes.length}`)
log(`noObjectLoss[${dataIdx}]: ${losses.noObjectLoss}`)
log(`objectLoss[${dataIdx}]: ${losses.objectLoss}`)
log(`coordLoss[${dataIdx}]: ${losses.coordLoss}`)
log(`totalLoss[${dataIdx}]: ${losses.totalLoss}`)
if (window.lossMap[lossKey]) {
log(`loss change: ${losses.totalLoss - window.lossMap[lossKey].totalLoss}`)
}
}
window.lossMap[lossKey] = losses
return totalLoss
}, true)
}
async function trainStep(batchCreators, inputSizes, rescaleEveryNthBatch, onBatchProcessed = () => {}) {
async function step(currentBatchCreators) {
if (!currentBatchCreators.batchCreators.length) {
return
}
await promiseSequential(inputSizes.map(inputSize => async () => {
await promiseSequential(currentBatchCreators.batchCreators.map(batchCreator => async () => {
const batch = await batchCreator()
const { imgs, groundTruthBoxes, filenames, dataIdx } = batch
const img = imgs[0]
const { reshapedImgDims, paddings } = getPaddingsAndReshapedSize(img, inputSize)
const squareImg = imageToSquare(img)
const batchInput = await faceapi.toNetInput(squareImg)
const [imgHeight, imgWidth] = batchInput.inputs[batchIdx].shape
// skip groundTruthBoxes, which are too tiny
const scaleFactor = inputSize / Math.max(imgHeight, imgWidth)
const filteredGroundTruthBoxes = groundTruthBoxes[batchIdx].filter(({ x, y, width, height }) => {
const box = (new faceapi.Rect(x, y, width, height))
.toBoundingBox()
.rescale({ height: imgHeight, width: imgWidth })
.rescale(scaleFactor)
const isTooTiny = box.width < 40 || box.height < 40
if (isTooTiny && window.debug) {
log(`skipping box for input size ${inputSize}: (${Math.floor(box.width)} x ${Math.floor(box.height)})`)
}
return !isTooTiny
})
if (!filteredGroundTruthBoxes.length) {
if (window.debug) {
log(`no boxes for input size ${inputSize}, ${groundTruthBoxes[batchIdx].length} boxes were too small`)
}
batchInput.dispose()
onBatchProcessed(dataIdx, inputSize)
return
}
let ts = Date.now()
const loss = minimize(filteredGroundTruthBoxes, batchInput, inputSize, batch, { reshapedImgDims, paddings })
ts = Date.now() - ts
if (window.logTrainSteps) {
log(`trainStep time for dataIdx ${dataIdx} (${inputSize}): ${ts} ms`)
}
loss.dispose()
batchInput.dispose()
onBatchProcessed(dataIdx, inputSize)
await tf.nextFrame()
}))
}))
await step(currentBatchCreators.next(rescaleEveryNthBatch))
}
await step(batchCreators.next(rescaleEveryNthBatch))
}
async function fetchGroundTruthBoxesForFile(file) {
const boxes = await fetch(file).then(res => res.json())
return {
file,
boxes
}
}
function createBatchCreators(detectionFilenames, batchSize) {
if (batchSize < 1) {
throw new Error('invalid batch size: ' + batchSize)
}
const batches = []
const pushToBatch = (remaining) => {
if (remaining.length) {
batches.push(remaining.slice(0, batchSize))
pushToBatch(remaining.slice(batchSize))
}
return batches
}
pushToBatch(detectionFilenames)
const batchCreators = batches.map((filenamesForBatch, dataIdx) => async () => {
const groundTruthBoxes = (await Promise.all(filenamesForBatch.map(fetchGroundTruthBoxesForFile)))
.map(({ boxes }) => boxes)
const imgs = await Promise.all(filenamesForBatch.map(
async file => await faceapi.bufferToImage(await fetchImage(file.replace('.json', '')))
))
return {
imgs,
groundTruthBoxes,
filenames: filenamesForBatch,
dataIdx
}
})
let idx = 0
function next(n) {
const nextBatchCreators = batchCreators.slice(idx, idx + n)
idx += n
return {
batchCreators: nextBatchCreators,
next
}
}
return {
data: batchCreators,
next
}
}
\ No newline at end of file
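createBatchCreators chunks the detection filenames into batches and exposes a cursor-style next(n), which trainStep uses to pull the next group of batch creators per input-size cycle. A minimal usage sketch (the filenames are made up):

```js
// Minimal usage sketch (made-up filenames): advance the cursor in groups
// of 2, the way trainStep does with rescaleEveryNthBatch.
const creators = createBatchCreators(['a.json', 'b.json', 'c.json'], 1)
let group = creators.next(2)          // first 2 batch creators
while (group.batchCreators.length) {
  // each creator resolves to { imgs, groundTruthBoxes, filenames, dataIdx }
  group = group.next(2)               // next group
}
```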
<!DOCTYPE html>
<html>
<head>
<script src="face-api.js"></script>
<script src="commons.js"></script>
<link rel="stylesheet" href="styles.css">
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/materialize/0.100.2/css/materialize.css">
<script type="text/javascript" src="https://code.jquery.com/jquery-2.1.1.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/materialize/0.100.2/js/materialize.min.js"></script>
</head>
<body>
<div id="navbar"></div>
<div class="center-content page-container">
<div class="progress" id="loader">
<div class="indeterminate"></div>
</div>
<div style="position: relative" class="margin">
<img id="inputImg" src="" style="max-width: 800px;" />
<canvas id="overlay" />
</div>
<div class="row side-by-side">
<div id="selectList"></div>
<div class="row">
<label for="imgUrlInput">Get image from URL:</label>
<input id="imgUrlInput" type="text" class="bold">
</div>
<button
class="waves-effect waves-light btn"
onclick="loadImageFromUrl()"
>
Ok
</button>
</div>
<div class="row side-by-side">
<div class="row input-field" style="margin-right: 20px;">
<select id="sizeType">
<option value="128">128 x 128</option>
<option value="160">160 x 160</option>
<option value="224">224 x 224</option>
<option value="320">320 x 320</option>
<option value="416">416 x 416</option>
<option value="608">608 x 608</option>
</select>
<label>Input Size</label>
</div>
<div class="row">
<label for="scoreThreshold">Score Threshold:</label>
<input disabled value="0.5" id="scoreThreshold" type="text" class="bold">
</div>
<button
class="waves-effect waves-light btn"
onclick="onDecreaseThreshold()"
>
<i class="material-icons left">-</i>
</button>
<button
class="waves-effect waves-light btn"
onclick="onIncreaseThreshold()"
>
<i class="material-icons left">+</i>
</button>
</div>
<div class="row">
<label for="imgByNr">Enter image NR: </label>
<input id="imgByNr" type="text" class="bold">
</div>
</div>
<script>
let scoreThreshold = 0.5
let sizeType = 608
function onKeyDown(e) {
e.target.value = (
parseInt(e.target.value) + (e.keyCode === 38 ? 1 : (e.keyCode === 40 ? -1 : 0))
) || e.target.value || 0
const imgUri = window.imgs[e.target.value]
console.log(imgUri)
onSelectionChanged(imgUri)
}
function onIncreaseThreshold() {
scoreThreshold = Math.min(faceapi.round(scoreThreshold + 0.1), 1.0)
$('#scoreThreshold').val(scoreThreshold)
updateResults()
}
function onDecreaseThreshold() {
scoreThreshold = Math.max(faceapi.round(scoreThreshold - 0.1), 0.1)
$('#scoreThreshold').val(scoreThreshold)
updateResults()
}
function onSizeTypeChanged(e, c) {
sizeType = e.target.value
$('#sizeType').val(sizeType)
updateResults()
}
async function loadImageFromUrl(url) {
const img = await requestExternalImage($('#imgUrlInput').val())
$('#inputImg').get(0).src = img.src
updateResults()
}
async function updateResults() {
const inputImgEl = $('#inputImg').get(0)
const { width, height } = inputImgEl
const canvas = $('#overlay').get(0)
canvas.width = width
canvas.height = height
const forwardParams = {
inputSize: parseInt(sizeType),
scoreThreshold
}
const detections = await window.net.locateFaces(inputImgEl, forwardParams)
faceapi.drawDetection('overlay', detections.map(det => det.forSize(width, height)))
}
async function onSelectionChanged(uri) {
const imgBuf = await fetchImage(uri)
$(`#inputImg`).get(0).src = (await faceapi.bufferToImage(imgBuf)).src
updateResults()
}
async function loadNetWeights(uri) {
return new Float32Array(await (await fetch(uri)).arrayBuffer())
}
async function fetchDetectionFilenames() {
return fetch('/detection_filenames').then(res => res.json())
}
async function run() {
$('#imgByNr').keydown(onKeyDown)
const startIdx224 = 35060
const startIdx320 = 41188
const startIdx416 = 31050
const startIdx608 = 16520
//const weightsUrl = `/tmp/tmp__224_${startIdx224}__320_${startIdx320}__416_${startIdx416}__608_${startIdx608}.weights`
const weightsUrl = `/tmp/tmp_multiscale_count_4200.weights`
const weights = await loadNetWeights(weightsUrl)
window.net = new faceapi.TinyYolov2(true)
await window.net.load(weights)
window.imgs = (await fetchDetectionFilenames()).map(f => f.replace('.json', ''))
$('#loader').hide()
onSelectionChanged($('#selectList select').val())
}
$(document).ready(function() {
renderNavBar('#navbar', 'tiny_yolov2_face_detection')
renderImageSelectList(
'#selectList',
async (uri) => {
await onSelectionChanged(uri)
},
'bbt1.jpg'
)
const sizeTypeSelect = $('#sizeType')
sizeTypeSelect.val(sizeType)
sizeTypeSelect.on('change', onSizeTypeChanged)
sizeTypeSelect.material_select()
run()
})
</script>
</body>
</html>
\ No newline at end of file
{
"extends": "../../tsconfig.json",
"include": [
"tinyYolov2"
]
}
[{"weights":[{"name":"conv0/conv/filters","shape":[3,3,3,16],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.0036540280370151294,"min":-0.44213739247883066}},{"name":"conv0/conv/bias","shape":[16],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.002164303555208094,"min":-0.13418682042290184}},{"name":"conv0/bn/sub","shape":[16],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.0028029036288167914,"min":-0.3139252064274806}},{"name":"conv0/bn/truediv","shape":[16],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.05010388411727606,"min":2.7867696285247803}},{"name":"conv1/conv/filters","shape":[3,3,16,32],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.0017398431020624497,"min":-0.22617960326811845}},{"name":"conv1/conv/bias","shape":[32],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.001145745112615473,"min":-0.12488621727508656}},{"name":"conv1/bn/sub","shape":[32],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.00586619587505565,"min":-0.9503237317590153}},{"name":"conv1/bn/truediv","shape":[32],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.007056442896525065,"min":1.2661800384521484}},{"name":"conv2/conv/filters","shape":[3,3,32,64],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.0010545693776186774,"min":-0.12971203344709734}},{"name":"conv2/conv/bias","shape":[64],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.0005329197820495157,"min":-0.07034541123053607}},{"name":"conv2/bn/sub","shape":[64],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.005863590567719703,"min":-0.691903686990925}},{"name":"conv2/bn/truediv","shape":[64],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.006374212339812634,"min":1.4001796245574951}},{"name":"conv3/conv/filters","shape":[3,3,64,128],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.0006935241175632851,"min":-0.08114232175490436}},{"name":"conv3/conv/bias","shape":[128],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.0002854522025468303,"min":-0.04224692597693088}},{"name":"conv3/bn/sub","shape":[128],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.008516784275279325,"min":-1.1667994457132675}},{"name":"conv3/bn/truediv","shape":[128],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.004096750652088838,"min":1.1394747495651245}},{"name":"conv4/conv/filters","shape":[3,3,128,256],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.0006537227946169236,"min":-0.06733344784554313}},{"name":"conv4/conv/bias","shape":[256],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.00018004616905076831,"min":-0.041770711219778246}},{"name":"conv4/bn/sub","shape":[256],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.011157989969440535,"min":-1.5509606057522343}},{"name":"conv4/bn/truediv","shape":[256],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.00421752836190018,"min":1.1967103481292725}},{"name":"conv5/conv/filters","shape":[3,3,256,512],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.0008280676077393925,"min":-0.05299632689532112}},{"name":"conv5/conv/bias","shape":[512],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.00007708504312105623,"min":-0.020973851904273033}},{"name":"conv5/bn/sub","shape":[512],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.011612189049814262,"min":-1.7302161684223252}},{"name":"conv5/bn/truediv","shape":[512],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.003486
9993434232826,"min":1.2963157892227173}},{"name":"conv6/conv/filters","shape":[3,3,512,1024],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.0003250038945207409,"min":-0.04030048292057187}},{"name":"conv6/conv/bias","shape":[1024],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.00009164303220114009,"min":-0.02338002622127533}},{"name":"conv6/bn/sub","shape":[1024],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.0168181017333386,"min":-2.068626513200648}},{"name":"conv6/bn/truediv","shape":[1024],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.004455030198190726,"min":1.309550166130066}},{"name":"conv7/conv/filters","shape":[3,3,1024,1024],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.00026194834533859704,"min":-0.033267439858001825}},{"name":"conv7/conv/bias","shape":[1024],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.0001491123554753322,"min":-0.011928988438026577}},{"name":"conv7/bn/sub","shape":[1024],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.009263983427309523,"min":-1.1765258952683093}},{"name":"conv7/bn/truediv","shape":[1024],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.01123507022857666,"min":0.9360886812210083}},{"name":"conv8/filters","shape":[1,1,1024,30],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.001922343233052422,"min":-0.2614386796951294}},{"name":"conv8/bias","shape":[30],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.0009638834233377494,"min":-0.15904076485072866}}],"paths":["tiny_yolov2_model-shard1","tiny_yolov2_model-shard2","tiny_yolov2_model-shard3","tiny_yolov2_model-shard4"]}]
\ No newline at end of file
[{"weights":[{"name":"conv0/depthwise_filter","shape":[3,3,3,1],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.004699238725737029,"min":-0.7471789573921876}},{"name":"conv0/pointwise_filter","shape":[1,1,3,16],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.008118405529097015,"min":-1.071629529840806}},{"name":"conv0/bias","shape":[16],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.0024678509609372006,"min":-0.28873856242965246}},{"name":"conv1/depthwise_filter","shape":[3,3,16,1],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.004553892331964829,"min":-0.5737904338275684}},{"name":"conv1/pointwise_filter","shape":[1,1,16,32],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.00980057996862075,"min":-1.3230782957638012}},{"name":"conv1/bias","shape":[32],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.0011220066278588537,"min":-0.20644921952602907}},{"name":"conv2/depthwise_filter","shape":[3,3,32,1],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.0032098570290733787,"min":-0.38839270051787883}},{"name":"conv2/pointwise_filter","shape":[1,1,32,64],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.008682825051101984,"min":-1.154815731796564}},{"name":"conv2/bias","shape":[64],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.0015120926440930834,"min":-0.21471715546121783}},{"name":"conv3/depthwise_filter","shape":[3,3,64,1],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.003597520496331009,"min":-0.4317024595597211}},{"name":"conv3/pointwise_filter","shape":[1,1,64,128],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.010341314240997913,"min":-1.3650534798117246}},{"name":"conv3/bias","shape":[128],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.002109630785736383,"min":-0.4113780032185947}},{"name":"conv4/depthwise_filter","shape":[3,3,128,1],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.004783747476689955,"min":-0.6171034244930043}},{"name":"conv4/pointwise_filter","shape":[1,1,128,256],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.009566552498761345,"min":-1.2627849298364977}},{"name":"conv4/bias","shape":[256],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.0020002245903015135,"min":-0.3860433459281921}},{"name":"conv5/depthwise_filter","shape":[3,3,256,1],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.004355777244941861,"min":-0.4791354969436047}},{"name":"conv5/pointwise_filter","shape":[1,1,256,512],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.010036561068366555,"min":-1.2545701335458193}},{"name":"conv5/bias","shape":[512],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.0023248311935686597,"min":-0.42776893961663337}},{"name":"conv6/depthwise_filter","shape":[3,3,512,1],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.004659063442080629,"min":-0.5963601205863205}},{"name":"conv6/pointwise_filter","shape":[1,1,512,1024],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.010061494509379069,"min":-1.2576868136723836}},{"name":"conv6/bias","shape":[1024],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.0029680932269376867,"min":-0.3947563991827123}},{"name":"conv7/depthwise_filter","shape":[3,3,1024,1],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.003887363508635876,"min":-0.48980780208812036}},{"name":"conv7/pointwise_filter","shape":[1,1,1024,1024],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.009973861189449535,"
min":-1.2766542322495404}},{"name":"conv7/bias","shape":[1024],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.004667898486642276,"min":-0.6955168745096991}},{"name":"conv8/filters","shape":[1,1,1024,25],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.01573958116419175,"min":-2.5340725674348716}},{"name":"conv8/bias","shape":[25],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.009396760662396749,"min":-2.2552225589752197}}],"paths":["tiny_yolov2_separable_conv_model-shard1"]}]
\ No newline at end of file
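Both manifests above describe uint8-quantized tensors, each with a per-tensor scale and min. Assuming the usual mapping back to float32 is value = q * scale + min (the inverse of the quantizeWeights step in the quantize view), a minimal dequantization sketch:

```js
// Minimal sketch (assumption: float value = q * scale + min), where `entry`
// is one weight entry from the manifest and `bytes` is its slice of the shard.
function dequantizeEntry(entry, bytes) {
  const { scale, min } = entry.quantization
  const floats = new Float32Array(bytes.length)
  bytes.forEach((q, i) => { floats[i] = q * scale + min })
  return floats
}
```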