Unverified Commit 5fefdcd5 by justadudewhohacks Committed by GitHub

Merge pull request #72 from justadudewhohacks/tiny-yolov2-seperable-conv2d

Tiny yolov2 seperable conv2d
parents 661f228b 4b4ecdb0
@@ -2,4 +2,6 @@ node_modules
.rpt2_cache
.env*
tmp
proto
weights_uncompressed
weights_unused
\ No newline at end of file
@@ -5,5 +5,6 @@ examples
proto
weights
weights_uncompressed
weights_unused
test
tools
\ No newline at end of file
@@ -27,7 +27,7 @@ Table of Contents:

* **[Face Detection & 5 Point Face Landmarks - MTCNN](#usage-face-detection-mtcnn)**
* **[Face Recognition](#usage-face-recognition)**
* **[68 Point Face Landmark Detection](#usage-face-landmark-detection)**
* **[Shortcut Functions for Full Face Description](#shortcut-functions)**

## Examples
@@ -89,15 +89,15 @@ The face detection model has been trained on the [WIDERFACE dataset](http://mmla

### Face Detection - Tiny Yolo v2

The Tiny Yolo v2 implementation is a very performant face detector, which can easily adapt to different input image sizes and can thus be used as an alternative to SSD Mobilenet v1 to trade off accuracy for performance (inference time). In general, the model's ability to locate smaller face bounding boxes is not as accurate as SSD Mobilenet v1's.

The face detector has been trained on a custom dataset of ~10K images labeled with bounding boxes and uses depthwise separable convolutions instead of regular convolutions, which ensures very fast inference and allows for a quantized model size of only 1.7MB, making the model extremely mobile and web friendly. Thus, the Tiny Yolo v2 face detector should be your GO-TO face detector on mobile devices.
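For example, a minimal sketch of running the Tiny Yolo v2 face detector (the model path and element ids are placeholders):

``` javascript
// load the quantized Tiny Yolo v2 (separable conv) model, e.g. from a /models folder
await faceapi.loadTinyYolov2Model('/models')

// trade off accuracy vs. inference time via the input size ('xs' | 'sm' | 'md' | 'lg') and the score threshold
const forwardParams = { inputSize: 'md', scoreThreshold: 0.5 }

const myImg = document.getElementById('myImg')
const detections = await faceapi.tinyYolov2(myImg, forwardParams)

// draw the detections, rescaled to the displayed image size
faceapi.drawDetection('overlay', detections.map(det => det.forSize(myImg.width, myImg.height)))
```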
<a name="about-face-detection-mtcnn"></a> <a name="about-face-detection-mtcnn"></a>
### Face Detection & 5 Point Face Landmarks - MTCNN ### Face Detection & 5 Point Face Landmarks - MTCNN
MTCNN (Multi-task Cascaded Convolutional Neural Networks) represents an alternative face detector to SSD Mobilenet v1 and Tiny Yolo v2, which offers much more room for configuration and is able to achieve much lower processing times. MTCNN is a 3 stage cascaded CNN, which simultanously returns 5 face landmark points along with the bounding boxes and scores for each face. By limiting the minimum size of faces expected in an image, MTCNN allows you to process frames from your webcam in realtime. Additionally with 2MB, the size of the weights file is only a third of the size of the quantized SSD Mobilenet v1 model (~6MB). MTCNN (Multi-task Cascaded Convolutional Neural Networks) represents an alternative face detector to SSD Mobilenet v1 and Tiny Yolo v2, which offers much more room for configuration. By tuning the input parameters, MTCNN is able to detect a wide range of face bounding box sizes. MTCNN is a 3 stage cascaded CNN, which simultanously returns 5 face landmark points along with the bounding boxes and scores for each face. By limiting the minimum size of faces expected in an image, MTCNN allows you to process frames from your webcam in realtime. Additionally with the model size is only 2MB.
MTCNN has been presented in the paper [Joint Face Detection and Alignment using Multi-task Cascaded Convolutional Networks](https://kpzhang93.github.io/MTCNN_face_detection_alignment/paper/spl.pdf) by Zhang et al. and the model weights are provided in the official [repo](https://github.com/kpzhang93/MTCNN_face_detection_alignment) of the MTCNN implementation. MTCNN has been presented in the paper [Joint Face Detection and Alignment using Multi-task Cascaded Convolutional Networks](https://kpzhang93.github.io/MTCNN_face_detection_alignment/paper/spl.pdf) by Zhang et al. and the model weights are provided in the official [repo](https://github.com/kpzhang93/MTCNN_face_detection_alignment) of the MTCNN implementation.
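For example, a minimal MTCNN sketch (the model path and the minFaceSize value are just illustrative):

``` javascript
// load the MTCNN and face recognition models, e.g. from a /models folder
await faceapi.loadMtcnnModel('/models')
await faceapi.loadFaceRecognitionModel('/models')

// a larger minFaceSize skips the smallest faces and considerably reduces processing time
const input = document.getElementById('inputVideo')
const fullFaceDescriptions = await faceapi.allFacesMtcnn(input, { minFaceSize: 100 })
```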
@@ -164,7 +164,7 @@ await net.load('/models/face_detection_model-weights_manifest.json')
// await net.load('/models/face_landmark_68_model-weights_manifest.json')
// await net.load('/models/face_recognition_model-weights_manifest.json')
// await net.load('/models/mtcnn_model-weights_manifest.json')
// await net.load('/models/tiny_yolov2_separable_conv_model-weights_manifest.json')

// or simply load all models
await net.load('/models')
@@ -197,7 +197,7 @@ const maxResults = 10

// inputs can be html canvas, img or video element or their ids ...
const myImg = document.getElementById('myImg')

const detections = await faceapi.ssdMobilenetv1(myImg, minConfidence, maxResults)
```

Draw the detected faces to a canvas:
@@ -356,7 +356,7 @@ const rightEyeBrow = landmarks.getRightEyeBrow()

Compute the Face Landmarks for Detected Faces:

``` javascript
const detections = await faceapi.ssdMobilenetv1(input)

// get the face tensors from the image (have to be disposed manually)
const faceTensors = await faceapi.extractFaceTensors(input, detections)

@@ -366,50 +366,35 @@ const landmarksByFace = await Promise.all(faceTensors.map(t => faceapi.detectLan

faceTensors.forEach(t => t.dispose())
```
<a name="usage-full-face-detection-and-recognition-pipeline"></a> <a name="shortcut-functions"></a>
### Full Face Detection and Recognition Pipeline ### Shortcut Functions for Full Face Description
After face detection has been performed, I would recommend to align the bounding boxes of the detected faces before passing them to the face recognition net, which will make the computed face descriptor much more accurate. Fortunately, the api can do this for you under the hood. You can obtain the full face descriptions (location, landmarks and descriptor) of each face in an input image as follows: After face detection has been performed, I would recommend to align the bounding boxes of the detected faces before passing them to the face recognition net, which will make the computed face descriptor much more accurate. Fortunately, the api can do this for you under the hood by providing convenient shortcut functions. You can obtain the full face descriptions (location, landmarks and descriptor) of each face in an input image as follows.
``` javascript Using the SSD Mobilenet v1 face detector + 68 point face landmark detector:
const fullFaceDescriptions = await faceapi.allFaces(input, minConfidence)
const fullFaceDescription0 = fullFaceDescriptions[0]
console.log(fullFaceDescription0.detection) // bounding box & score
console.log(fullFaceDescription0.landmarks) // 68 point face landmarks
console.log(fullFaceDescription0.descriptor) // face descriptor
``` javascript
const fullFaceDescriptions = await faceapi.allFacesSsdMobilenetv1(input, minConfidence)
``` ```
You can also do everything manually as shown in the following: Using the Tiny Yolo v2 face detector + 68 point face landmark detector:
``` javascript ``` javascript
// first detect the face locations const fullFaceDescriptions = await faceapi.allFacesTinyYolov2(input, { inputSize: 'md' })
const detections = await faceapi.locateFaces(input, minConfidence) ```
// get the face tensors from the image (have to be disposed manually)
const faceTensors = (await faceapi.extractFaceTensors(input, detections))
// detect landmarks and get the aligned face image bounding boxes Or with MTCNN face detection + 5 point face landmarks:
const alignedFaceBoxes = await Promise.all(faceTensors.map(
async (faceTensor, i) => {
const faceLandmarks = await faceapi.detectLandmarks(faceTensor)
return faceLandmarks.align(detections[i])
}
))
// free memory for face image tensors after we detected the face landmarks ``` javascript
faceTensors.forEach(t => t.dispose()) const fullFaceDescriptions = await faceapi.allFacesMtcnn(input, { minFaceSize: 20 })
```
// get the face tensors for the aligned face images from the image (have to be disposed manually)
const alignedFaceTensors = (await faceapi.extractFaceTensors(input, alignedFaceBoxes))
// compute the face descriptors from the aligned face images The shortcut functions return an array of FullFaceDescriptions:
const descriptors = await Promise.all(alignedFaceTensors.map(
faceTensor => faceapi.computeFaceDescriptor(faceTensor)
))
// free memory for face image tensors after we computed their descriptors ``` javascript
alignedFaceTensors.forEach(t => t.dispose()) const fullFaceDescription0 = fullFaceDescriptions[0]
``` console.log(fullFaceDescription0.detection) // bounding box & score
console.log(fullFaceDescription0.landmarks) // face landmarks
console.log(fullFaceDescription0.descriptor) // face descriptor
```
\ No newline at end of file
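The descriptors can then be compared by euclidean distance to decide whether two faces belong to the same person. A minimal plain JS sketch (descriptorDistance is just an illustrative helper, and at least two detected faces are assumed):

``` javascript
// euclidean distance between two 128 value face descriptors
function descriptorDistance(d1, d2) {
  return Math.sqrt(
    Array.from(d1).reduce((sum, val, i) => sum + Math.pow(val - d2[i], 2), 0)
  )
}

const distance = descriptorDistance(
  fullFaceDescriptions[0].descriptor,
  fullFaceDescriptions[1].descriptor
)

// the examples treat distances below ~0.6 (maxDistance) as a match
console.log(distance)
```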
...@@ -147,6 +147,10 @@ function renderNavBar(navbarId, exampleUri) { ...@@ -147,6 +147,10 @@ function renderNavBar(navbarId, exampleUri) {
name: 'Tiny Yolov2 Face Detection Webcam' name: 'Tiny Yolov2 Face Detection Webcam'
}, },
{ {
uri: 'tiny_yolov2_face_recognition',
name: 'Tiny Yolov2 Face Recognition'
},
{
uri: 'batch_face_landmarks', uri: 'batch_face_landmarks',
name: 'Batch Face Landmarks' name: 'Batch Face Landmarks'
}, },
......
...@@ -32,6 +32,7 @@ app.get('/mtcnn_face_recognition_webcam', (req, res) => res.sendFile(path.join(v ...@@ -32,6 +32,7 @@ app.get('/mtcnn_face_recognition_webcam', (req, res) => res.sendFile(path.join(v
app.get('/tiny_yolov2_face_detection', (req, res) => res.sendFile(path.join(viewsDir, 'tinyYolov2FaceDetection.html'))) app.get('/tiny_yolov2_face_detection', (req, res) => res.sendFile(path.join(viewsDir, 'tinyYolov2FaceDetection.html')))
app.get('/tiny_yolov2_face_detection_video', (req, res) => res.sendFile(path.join(viewsDir, 'tinyYolov2FaceDetectionVideo.html'))) app.get('/tiny_yolov2_face_detection_video', (req, res) => res.sendFile(path.join(viewsDir, 'tinyYolov2FaceDetectionVideo.html')))
app.get('/tiny_yolov2_face_detection_webcam', (req, res) => res.sendFile(path.join(viewsDir, 'tinyYolov2FaceDetectionWebcam.html'))) app.get('/tiny_yolov2_face_detection_webcam', (req, res) => res.sendFile(path.join(viewsDir, 'tinyYolov2FaceDetectionWebcam.html')))
app.get('/tiny_yolov2_face_recognition', (req, res) => res.sendFile(path.join(viewsDir, 'tinyYolov2FaceRecognition.html')))
app.get('/batch_face_landmarks', (req, res) => res.sendFile(path.join(viewsDir, 'batchFaceLandmarks.html'))) app.get('/batch_face_landmarks', (req, res) => res.sendFile(path.join(viewsDir, 'batchFaceLandmarks.html')))
app.get('/batch_face_recognition', (req, res) => res.sendFile(path.join(viewsDir, 'batchFaceRecognition.html'))) app.get('/batch_face_recognition', (req, res) => res.sendFile(path.join(viewsDir, 'batchFaceRecognition.html')))
......
...@@ -53,6 +53,15 @@ ...@@ -53,6 +53,15 @@
let modelLoaded = false let modelLoaded = false
let result let result
let forwardTimes = []
function updateTimeStats(timeInMs) {
forwardTimes = [timeInMs].concat(forwardTimes).slice(0, 30)
const avgTimeInMs = forwardTimes.reduce((total, t) => total + t) / forwardTimes.length
$('#time').val(`${Math.round(avgTimeInMs)} ms`)
$('#fps').val(`${faceapi.round(1000 / avgTimeInMs)}`)
}
function onIncreaseThreshold() { function onIncreaseThreshold() {
minConfidence = Math.min(faceapi.round(minConfidence + 0.1), 1.0) minConfidence = Math.min(faceapi.round(minConfidence + 0.1), 1.0)
$('#minConfidence').val(minConfidence) $('#minConfidence').val(minConfidence)
...@@ -63,11 +72,6 @@ ...@@ -63,11 +72,6 @@
$('#minConfidence').val(minConfidence) $('#minConfidence').val(minConfidence)
} }
function displayTimeStats(timeInMs) {
$('#time').val(`${timeInMs} ms`)
$('#fps').val(`${faceapi.round(1000 / timeInMs)}`)
}
async function onPlay(videoEl) { async function onPlay(videoEl) {
if(videoEl.paused || videoEl.ended || !modelLoaded) if(videoEl.paused || videoEl.ended || !modelLoaded)
return false return false
...@@ -79,7 +83,7 @@ ...@@ -79,7 +83,7 @@
const ts = Date.now() const ts = Date.now()
result = await faceapi.locateFaces(videoEl, minConfidence) result = await faceapi.locateFaces(videoEl, minConfidence)
displayTimeStats(Date.now() - ts) updateTimeStats(Date.now() - ts)
faceapi.drawDetection('overlay', result.map(det => det.forSize(width, height))) faceapi.drawDetection('overlay', result.map(det => det.forSize(width, height)))
setTimeout(() => onPlay(videoEl)) setTimeout(() => onPlay(videoEl))
......
...@@ -63,6 +63,15 @@ ...@@ -63,6 +63,15 @@
let sizeType = 'md' let sizeType = 'md'
let modelLoaded = false let modelLoaded = false
let forwardTimes = []
function updateTimeStats(timeInMs) {
forwardTimes = [timeInMs].concat(forwardTimes).slice(0, 30)
const avgTimeInMs = forwardTimes.reduce((total, t) => total + t) / forwardTimes.length
$('#time').val(`${Math.round(avgTimeInMs)} ms`)
$('#fps').val(`${faceapi.round(1000 / avgTimeInMs)}`)
}
function onIncreaseThreshold() { function onIncreaseThreshold() {
scoreThreshold = Math.min(faceapi.round(scoreThreshold + 0.1), 1.0) scoreThreshold = Math.min(faceapi.round(scoreThreshold + 0.1), 1.0)
$('#scoreThreshold').val(scoreThreshold) $('#scoreThreshold').val(scoreThreshold)
...@@ -78,11 +87,6 @@ ...@@ -78,11 +87,6 @@
$('#sizeType').val(sizeType) $('#sizeType').val(sizeType)
} }
function displayTimeStats(timeInMs) {
$('#time').val(`${timeInMs} ms`)
$('#fps').val(`${faceapi.round(1000 / timeInMs)}`)
}
async function onPlay(videoEl) { async function onPlay(videoEl) {
if(videoEl.paused || videoEl.ended || !modelLoaded) if(videoEl.paused || videoEl.ended || !modelLoaded)
return false return false
...@@ -99,7 +103,7 @@ ...@@ -99,7 +103,7 @@
const ts = Date.now() const ts = Date.now()
result = await faceapi.tinyYolov2(videoEl, forwardParams) result = await faceapi.tinyYolov2(videoEl, forwardParams)
displayTimeStats(Date.now() - ts) updateTimeStats(Date.now() - ts)
faceapi.drawDetection('overlay', result.map(det => det.forSize(width, height))) faceapi.drawDetection('overlay', result.map(det => det.forSize(width, height)))
setTimeout(() => onPlay(videoEl)) setTimeout(() => onPlay(videoEl))
......
...@@ -64,6 +64,15 @@ ...@@ -64,6 +64,15 @@
let sizeType = '160' let sizeType = '160'
let modelLoaded = false let modelLoaded = false
let forwardTimes = []
function updateTimeStats(timeInMs) {
forwardTimes = [timeInMs].concat(forwardTimes).slice(0, 30)
const avgTimeInMs = forwardTimes.reduce((total, t) => total + t) / forwardTimes.length
$('#time').val(`${Math.round(avgTimeInMs)} ms`)
$('#fps').val(`${faceapi.round(1000 / avgTimeInMs)}`)
}
function onIncreaseThreshold() { function onIncreaseThreshold() {
scoreThreshold = Math.min(faceapi.round(scoreThreshold + 0.1), 1.0) scoreThreshold = Math.min(faceapi.round(scoreThreshold + 0.1), 1.0)
$('#scoreThreshold').val(scoreThreshold) $('#scoreThreshold').val(scoreThreshold)
...@@ -79,11 +88,6 @@ ...@@ -79,11 +88,6 @@
$('#sizeType').val(sizeType) $('#sizeType').val(sizeType)
} }
function displayTimeStats(timeInMs) {
$('#time').val(`${timeInMs} ms`)
$('#fps').val(`${faceapi.round(1000 / timeInMs)}`)
}
async function onPlay(videoEl) { async function onPlay(videoEl) {
if(videoEl.paused || videoEl.ended || !modelLoaded) if(videoEl.paused || videoEl.ended || !modelLoaded)
return false return false
...@@ -100,7 +104,7 @@ ...@@ -100,7 +104,7 @@
const ts = Date.now() const ts = Date.now()
result = await faceapi.tinyYolov2(videoEl, forwardParams) result = await faceapi.tinyYolov2(videoEl, forwardParams)
displayTimeStats(Date.now() - ts) updateTimeStats(Date.now() - ts)
faceapi.drawDetection('overlay', result.map(det => det.forSize(width, height))) faceapi.drawDetection('overlay', result.map(det => det.forSize(width, height)))
setTimeout(() => onPlay(videoEl)) setTimeout(() => onPlay(videoEl))
...@@ -124,7 +128,7 @@ ...@@ -124,7 +128,7 @@
} }
$(document).ready(function() { $(document).ready(function() {
renderNavBar('#navbar', 'tiny_yolov2_face_detection_video') renderNavBar('#navbar', 'tiny_yolov2_face_detection_webcam')
const sizeTypeSelect = $('#sizeType') const sizeTypeSelect = $('#sizeType')
sizeTypeSelect.val(sizeType) sizeTypeSelect.val(sizeType)
......
<!DOCTYPE html>
<html>
<head>
<script src="face-api.js"></script>
<script src="commons.js"></script>
<link rel="stylesheet" href="styles.css">
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/materialize/0.100.2/css/materialize.css">
<script type="text/javascript" src="https://code.jquery.com/jquery-2.1.1.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/materialize/0.100.2/js/materialize.min.js"></script>
</head>
<body>
<div id="navbar"></div>
<div class="center-content page-container">
<div class="progress" id="loader">
<div class="indeterminate"></div>
</div>
<div style="position: relative" class="margin">
<img id="inputImg" src="" style="max-width: 800px;" />
<canvas id="overlay" />
</div>
<div class="row side-by-side">
<div id="selectList"></div>
<div class="row">
<label for="imgUrlInput">Get image from URL:</label>
<input id="imgUrlInput" type="text" class="bold">
</div>
<button
class="waves-effect waves-light btn"
onclick="loadImageFromUrl()"
>
Ok
</button>
<p>
<input type="checkbox" id="useBatchProcessing" onchange="onChangeUseBatchProcessing(event)" />
<label for="useBatchProcessing">Use Batch Processing</label>
</p>
</div>
<div class="row side-by-side">
<div class="row input-field" style="margin-right: 20px;">
<select id="sizeType">
<option value="" disabled selected>Input Size:</option>
<option value="xs">XS: 224 x 224</option>
<option value="sm">SM: 320 x 320</option>
<option value="md">MD: 416 x 416</option>
<option value="lg">LG: 608 x 608</option>
</select>
<label>Input Size</label>
</div>
<div class="row">
<label for="scoreThreshold">Score Threshold:</label>
<input disabled value="0.5" id="scoreThreshold" type="text" class="bold">
</div>
<button
class="waves-effect waves-light btn"
onclick="onDecreaseThreshold()"
>
<i class="material-icons left">-</i>
</button>
<button
class="waves-effect waves-light btn"
onclick="onIncreaseThreshold()"
>
<i class="material-icons left">+</i>
</button>
</div>
<div class="row side-by-side">
<div class="row">
<label for="maxDistance">Max Descriptor Distance:</label>
<input disabled value="0.6" id="maxDistance" type="text" class="bold">
</div>
<button
class="waves-effect waves-light btn button-sm"
onclick="onDecreaseMaxDistance()"
>
<i class="material-icons left">-</i>
</button>
<button
class="waves-effect waves-light btn button-sm"
onclick="onIncreaseMaxDistance()"
>
<i class="material-icons left">+</i>
</button>
</div>
</div>
<script>
let maxDistance = 0.6
let useBatchProcessing = false
let trainDescriptorsByClass = []
let scoreThreshold = 0.5
let sizeType = 'lg'
function onIncreaseThreshold() {
scoreThreshold = Math.min(faceapi.round(scoreThreshold + 0.1), 1.0)
$('#scoreThreshold').val(scoreThreshold)
updateResults()
}
function onDecreaseThreshold() {
scoreThreshold = Math.max(faceapi.round(scoreThreshold - 0.1), 0.1)
$('#scoreThreshold').val(scoreThreshold)
updateResults()
}
function onSizeTypeChanged(e, c) {
sizeType = e.target.value
$('#sizeType').val(sizeType)
updateResults()
}
function onChangeUseBatchProcessing(e) {
useBatchProcessing = $(e.target).prop('checked')
}
function onIncreaseMaxDistance() {
maxDistance = Math.min(faceapi.round(maxDistance + 0.1), 1.0)
$('#maxDistance').val(maxDistance)
updateResults()
}
function onDecreaseMaxDistance() {
maxDistance = Math.max(faceapi.round(maxDistance - 0.1), 0.1)
$('#maxDistance').val(maxDistance)
updateResults()
}
async function loadImageFromUrl(url) {
const img = await requestExternalImage($('#imgUrlInput').val())
$('#inputImg').get(0).src = img.src
updateResults()
}
async function updateResults() {
const inputImgEl = $('#inputImg').get(0)
const { width, height } = inputImgEl
const canvas = $('#overlay').get(0)
canvas.width = width
canvas.height = height
const forwardParams = {
inputSize: sizeType,
scoreThreshold
}
const fullFaceDescriptions = (await faceapi.allFacesTinyYolov2(inputImgEl, forwardParams, useBatchProcessing))
.map(fd => fd.forSize(width, height))
fullFaceDescriptions.forEach(({ detection, descriptor }) => {
faceapi.drawDetection('overlay', [detection], { withScore: false })
const bestMatch = getBestMatch(trainDescriptorsByClass, descriptor)
const text = `${bestMatch.distance < maxDistance ? bestMatch.className : 'unknown'} (${bestMatch.distance})`
const { x, y, height: boxHeight } = detection.getBox()
faceapi.drawText(
canvas.getContext('2d'),
x,
y + boxHeight,
text,
Object.assign(faceapi.getDefaultDrawOptions(), { color: 'red', fontSize: 16 })
)
})
}
async function onSelectionChanged(uri) {
const imgBuf = await fetchImage(uri)
$(`#inputImg`).get(0).src = (await faceapi.bufferToImage(imgBuf)).src
updateResults()
}
async function run() {
await faceapi.loadTinyYolov2Model('/')
await faceapi.loadFaceLandmarkModel('/')
await faceapi.loadFaceRecognitionModel('/')
trainDescriptorsByClass = await initTrainDescriptorsByClass(faceapi.recognitionNet, 1)
$('#loader').hide()
onSelectionChanged($('#selectList select').val())
}
$(document).ready(function() {
renderNavBar('#navbar', 'tiny_yolov2_face_recognition')
renderImageSelectList(
'#selectList',
async (uri) => {
await onSelectionChanged(uri)
},
'bbt1.jpg'
)
const sizeTypeSelect = $('#sizeType')
sizeTypeSelect.val(sizeType)
sizeTypeSelect.on('change', onSizeTypeChanged)
sizeTypeSelect.material_select()
run()
})
</script>
</body>
</html>
\ No newline at end of file
...@@ -4,7 +4,8 @@ const dataFiles = [ ...@@ -4,7 +4,8 @@ const dataFiles = [
'test/data/*.json', 'test/data/*.json',
'test/media/*.mp4', 'test/media/*.mp4',
'weights/**/*', 'weights/**/*',
'weights_uncompressed/**/*' 'weights_uncompressed/**/*',
'weights_unused/**/*'
].map(pattern => ({ ].map(pattern => ({
pattern, pattern,
watched: false, watched: false,
......
import { Rect } from './Rect'; import { Rect } from './Rect';
import { Dimensions } from './types';
import { isDimensions } from './utils';
export class BoundingBox { export class BoundingBox {
constructor( constructor(
...@@ -33,6 +35,10 @@ export class BoundingBox { ...@@ -33,6 +35,10 @@ export class BoundingBox {
return this.bottom - this.top return this.bottom - this.top
} }
public get area() : number {
return this.width * this.height
}
public toSquare(): BoundingBox { public toSquare(): BoundingBox {
let { left, top, right, bottom } = this let { left, top, right, bottom } = this
...@@ -100,6 +106,12 @@ export class BoundingBox { ...@@ -100,6 +106,12 @@ export class BoundingBox {
).toSquare().round() ).toSquare().round()
} }
public rescale(s: Dimensions | number) {
const scaleX = isDimensions(s) ? (s as Dimensions).width : s as number
const scaleY = isDimensions(s) ? (s as Dimensions).height : s as number
return new BoundingBox(this.left * scaleX, this.top * scaleY, this.right * scaleX, this.bottom * scaleY)
}
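The rescale helper added above maps a box given in relative coordinates to absolute coordinates (or scales it by a single factor). A quick sketch with made-up numbers:

``` javascript
// relative box (values in [0, 1]) rescaled to a 640 x 480 image
const relativeBox = new faceapi.BoundingBox(0.1, 0.2, 0.5, 0.8)
const absoluteBox = relativeBox.rescale({ width: 640, height: 480 })
// -> left: 64, top: 96, right: 320, bottom: 384
```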
public toRect(): Rect { public toRect(): Rect {
return new Rect(this.left, this.top, this.width, this.height) return new Rect(this.left, this.top, this.width, this.height)
} }
......
...@@ -3,7 +3,7 @@ import * as tf from '@tensorflow/tfjs-core'; ...@@ -3,7 +3,7 @@ import * as tf from '@tensorflow/tfjs-core';
import { isTensor3D, isTensor4D } from './commons/isTensor'; import { isTensor3D, isTensor4D } from './commons/isTensor';
import { padToSquare } from './padToSquare'; import { padToSquare } from './padToSquare';
import { Point } from './Point'; import { Point } from './Point';
import { TResolvedNetInput } from './types'; import { TResolvedNetInput, Dimensions } from './types';
import { createCanvasFromMedia } from './utils'; import { createCanvasFromMedia } from './utils';
export class NetInput { export class NetInput {
...@@ -14,6 +14,7 @@ export class NetInput { ...@@ -14,6 +14,7 @@ export class NetInput {
private _inputDimensions: number[][] = [] private _inputDimensions: number[][] = []
private _paddings: Point[] = [] private _paddings: Point[] = []
private _inputSize: number = 0
constructor( constructor(
inputs: tf.Tensor4D | Array<TResolvedNetInput>, inputs: tf.Tensor4D | Array<TResolvedNetInput>,
...@@ -81,6 +82,22 @@ export class NetInput { ...@@ -81,6 +82,22 @@ export class NetInput {
return this._paddings return this._paddings
} }
public get inputSize(): number {
return this._inputSize
}
public get relativePaddings(): Point[] {
return Array(this.inputs.length).fill(0).map(
(_, batchIdx) => this.getRelativePaddings(batchIdx)
)
}
public get reshapedInputDimensions(): Dimensions[] {
return Array(this.inputs.length).fill(0).map(
(_, batchIdx) => this.getReshapedInputDimensions(batchIdx)
)
}
public getInputDimensions(batchIdx: number): number[] { public getInputDimensions(batchIdx: number): number[] {
return this._inputDimensions[batchIdx] return this._inputDimensions[batchIdx]
} }
...@@ -97,8 +114,26 @@ export class NetInput { ...@@ -97,8 +114,26 @@ export class NetInput {
return this._paddings[batchIdx] return this._paddings[batchIdx]
} }
public getRelativePaddings(batchIdx: number): Point {
return new Point(
(this.getPaddings(batchIdx).x + this.getInputWidth(batchIdx)) / this.getInputWidth(batchIdx),
(this.getPaddings(batchIdx).y + this.getInputHeight(batchIdx)) / this.getInputHeight(batchIdx)
)
}
public getReshapedInputDimensions(batchIdx: number): Dimensions {
const [h, w] = [this.getInputHeight(batchIdx), this.getInputWidth(batchIdx)]
const f = this.inputSize / Math.max(h, w)
return {
height: Math.floor(h * f),
width: Math.floor(w * f)
}
}
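getReshapedInputDimensions returns the size an input ends up with after being scaled to fit the square network input. A quick sketch with hypothetical numbers:

``` javascript
// a 640 x 480 frame forwarded with an inputSize of 416
const [h, w, inputSize] = [480, 640, 416]
const f = inputSize / Math.max(h, w) // 0.65
console.log({ height: Math.floor(h * f), width: Math.floor(w * f) }) // { height: 312, width: 416 }
```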
public toBatchTensor(inputSize: number, isCenterInputs: boolean = true): tf.Tensor4D { public toBatchTensor(inputSize: number, isCenterInputs: boolean = true): tf.Tensor4D {
this._inputSize = inputSize
return tf.tidy(() => { return tf.tidy(() => {
const inputTensors = this._inputs.map((inputTensor: tf.Tensor3D) => { const inputTensors = this._inputs.map((inputTensor: tf.Tensor3D) => {
...@@ -131,7 +166,7 @@ export class NetInput { ...@@ -131,7 +166,7 @@ export class NetInput {
} }
/**
 * By setting the isManaged flag, all newly created tensors will be
 * automatically disposed after the batch tensor has been created
 */
public managed() { public managed() {
......
import { BoundingBox } from './BoundingBox';
export interface IRect { export interface IRect {
x: number x: number
y: number y: number
...@@ -54,6 +55,10 @@ export class Rect implements IRect { ...@@ -54,6 +55,10 @@ export class Rect implements IRect {
) )
} }
public toBoundingBox(): BoundingBox {
return new BoundingBox(this.x, this.y, this.x + this.width, this.y + this.height)
}
public clipAtImageBorders(imgWidth: number, imgHeight: number): Rect { public clipAtImageBorders(imgWidth: number, imgHeight: number): Rect {
const { x, y, right, bottom } = this const { x, y, right, bottom } = this
const clippedX = Math.max(x, 0) const clippedX = Math.max(x, 0)
......
import { TinyYolov2 } from '.';
import { extractFaceTensors } from './extractFaceTensors'; import { extractFaceTensors } from './extractFaceTensors';
import { FaceDetection } from './FaceDetection';
import { FaceDetectionNet } from './faceDetectionNet/FaceDetectionNet'; import { FaceDetectionNet } from './faceDetectionNet/FaceDetectionNet';
import { FaceLandmarkNet } from './faceLandmarkNet/FaceLandmarkNet'; import { FaceLandmarkNet } from './faceLandmarkNet/FaceLandmarkNet';
import { FaceLandmarks68 } from './faceLandmarkNet/FaceLandmarks68'; import { FaceLandmarks68 } from './faceLandmarkNet/FaceLandmarks68';
...@@ -7,6 +9,7 @@ import { FullFaceDescription } from './FullFaceDescription'; ...@@ -7,6 +9,7 @@ import { FullFaceDescription } from './FullFaceDescription';
import { Mtcnn } from './mtcnn/Mtcnn'; import { Mtcnn } from './mtcnn/Mtcnn';
import { MtcnnForwardParams } from './mtcnn/types'; import { MtcnnForwardParams } from './mtcnn/types';
import { Rect } from './Rect'; import { Rect } from './Rect';
import { TinyYolov2ForwardParams } from './tinyYolov2/types';
import { TNetInput } from './types'; import { TNetInput } from './types';
function computeDescriptorsFactory( function computeDescriptorsFactory(
...@@ -27,8 +30,8 @@ function computeDescriptorsFactory( ...@@ -27,8 +30,8 @@ function computeDescriptorsFactory(
} }
} }
export function allFacesFactory( function allFacesFactory(
detectionNet: FaceDetectionNet, detectFaces: (input: TNetInput) => Promise<FaceDetection[]>,
landmarkNet: FaceLandmarkNet, landmarkNet: FaceLandmarkNet,
recognitionNet: FaceRecognitionNet recognitionNet: FaceRecognitionNet
) { ) {
...@@ -36,11 +39,10 @@ export function allFacesFactory( ...@@ -36,11 +39,10 @@ export function allFacesFactory(
return async function( return async function(
input: TNetInput, input: TNetInput,
minConfidence: number = 0.8,
useBatchProcessing: boolean = false useBatchProcessing: boolean = false
): Promise<FullFaceDescription[]> { ): Promise<FullFaceDescription[]> {
const detections = await detectionNet.locateFaces(input, minConfidence) const detections = await detectFaces(input)
const faceTensors = await extractFaceTensors(input, detections) const faceTensors = await extractFaceTensors(input, detections)
const faceLandmarksByFace = useBatchProcessing const faceLandmarksByFace = useBatchProcessing
...@@ -68,6 +70,38 @@ export function allFacesFactory( ...@@ -68,6 +70,38 @@ export function allFacesFactory(
} }
} }
export function allFacesSsdMobilenetv1Factory(
ssdMobilenetv1: FaceDetectionNet,
landmarkNet: FaceLandmarkNet,
recognitionNet: FaceRecognitionNet
) {
return async function(
input: TNetInput,
minConfidence: number = 0.8,
useBatchProcessing: boolean = false
): Promise<FullFaceDescription[]> {
const detectFaces = (input: TNetInput) => ssdMobilenetv1.locateFaces(input, minConfidence)
const allFaces = allFacesFactory(detectFaces, landmarkNet, recognitionNet)
return allFaces(input, useBatchProcessing)
}
}
export function allFacesTinyYolov2Factory(
tinyYolov2: TinyYolov2,
landmarkNet: FaceLandmarkNet,
recognitionNet: FaceRecognitionNet
) {
return async function(
input: TNetInput,
forwardParams: TinyYolov2ForwardParams = {},
useBatchProcessing: boolean = false
): Promise<FullFaceDescription[]> {
const detectFaces = (input: TNetInput) => tinyYolov2.locateFaces(input, forwardParams)
const allFaces = allFacesFactory(detectFaces, landmarkNet, recognitionNet)
return allFaces(input, useBatchProcessing)
}
}
export function allFacesMtcnnFactory( export function allFacesMtcnnFactory(
mtcnn: Mtcnn, mtcnn: Mtcnn,
recognitionNet: FaceRecognitionNet recognitionNet: FaceRecognitionNet
......
import { BoundingBox } from '../BoundingBox'; import { BoundingBox } from '../BoundingBox';
import { iou } from '../iou';
export function nonMaxSuppression( export function nonMaxSuppression(
boxes: BoundingBox[], boxes: BoundingBox[],
...@@ -7,10 +8,6 @@ export function nonMaxSuppression( ...@@ -7,10 +8,6 @@ export function nonMaxSuppression(
isIOU: boolean = true isIOU: boolean = true
): number[] { ): number[] {
const areas = boxes.map(
box => (box.width + 1) * (box.height + 1)
)
let indicesSortedByScore = scores let indicesSortedByScore = scores
.map((score, boxIndex) => ({ score, boxIndex })) .map((score, boxIndex) => ({ score, boxIndex }))
.sort((c1, c2) => c1.score - c2.score) .sort((c1, c2) => c1.score - c2.score)
...@@ -31,15 +28,7 @@ export function nonMaxSuppression( ...@@ -31,15 +28,7 @@ export function nonMaxSuppression(
const currBox = boxes[curr] const currBox = boxes[curr]
const idxBox = boxes[idx] const idxBox = boxes[idx]
const width = Math.max(0.0, Math.min(currBox.right, idxBox.right) - Math.max(currBox.left, idxBox.left) + 1) outputs.push(iou(currBox, idxBox, isIOU))
const height = Math.max(0.0, Math.min(currBox.bottom, idxBox.bottom) - Math.max(currBox.top, idxBox.top) + 1)
const interSection = width * height
const out = isIOU
? interSection / (areas[curr] + areas[idx] - interSection)
: interSection / Math.min(areas[curr], areas[idx])
outputs.push(out)
} }
indicesSortedByScore = indicesSortedByScore.filter( indicesSortedByScore = indicesSortedByScore.filter(
......
import * as tf from '@tensorflow/tfjs-core';
export function normalize(x: tf.Tensor4D, meanRgb: number[]): tf.Tensor4D {
return tf.tidy(() => {
const [r, g, b] = meanRgb
const avg_r = tf.fill([...x.shape.slice(0, 3), 1], r)
const avg_g = tf.fill([...x.shape.slice(0, 3), 1], g)
const avg_b = tf.fill([...x.shape.slice(0, 3), 1], b)
const avg_rgb = tf.concat([avg_r, avg_g, avg_b], 3)
return tf.sub(x, avg_rgb)
})
}
\ No newline at end of file
...@@ -81,19 +81,18 @@ export class FaceDetectionNet extends NeuralNetwork<NetParams> { ...@@ -81,19 +81,18 @@ export class FaceDetectionNet extends NeuralNetwork<NetParams> {
minConfidence minConfidence
) )
const paddedHeightRelative = (netInput.getPaddings(0).y + netInput.getInputHeight(0)) / netInput.getInputHeight(0) const paddings = netInput.getRelativePaddings(0)
const paddedWidthRelative = (netInput.getPaddings(0).x + netInput.getInputWidth(0)) / netInput.getInputWidth(0)
const results = indices const results = indices
.map(idx => { .map(idx => {
const [top, bottom] = [ const [top, bottom] = [
Math.max(0, boxes.get(idx, 0)), Math.max(0, boxes.get(idx, 0)),
Math.min(1.0, boxes.get(idx, 2)) Math.min(1.0, boxes.get(idx, 2))
].map(val => val * paddedHeightRelative) ].map(val => val * paddings.y)
const [left, right] = [ const [left, right] = [
Math.max(0, boxes.get(idx, 1)), Math.max(0, boxes.get(idx, 1)),
Math.min(1.0, boxes.get(idx, 3)) Math.min(1.0, boxes.get(idx, 3))
].map(val => val * paddedWidthRelative) ].map(val => val * paddings.x)
return new FaceDetection( return new FaceDetection(
scoresData[idx], scoresData[idx],
new Rect( new Rect(
......
import * as tf from '@tensorflow/tfjs-core'; import * as tf from '@tensorflow/tfjs-core';
import { NeuralNetwork } from '../commons/NeuralNetwork'; import { NeuralNetwork } from '../commons/NeuralNetwork';
import { normalize } from '../commons/normalize';
import { NetInput } from '../NetInput'; import { NetInput } from '../NetInput';
import { toNetInput } from '../toNetInput'; import { toNetInput } from '../toNetInput';
import { TNetInput } from '../types'; import { TNetInput } from '../types';
import { convDown } from './convLayer'; import { convDown } from './convLayer';
import { extractParams } from './extractParams'; import { extractParams } from './extractParams';
import { loadQuantizedParams } from './loadQuantizedParams'; import { loadQuantizedParams } from './loadQuantizedParams';
import { normalize } from './normalize';
import { residual, residualDown } from './residualLayer'; import { residual, residualDown } from './residualLayer';
import { NetParams } from './types'; import { NetParams } from './types';
...@@ -28,7 +28,8 @@ export class FaceRecognitionNet extends NeuralNetwork<NetParams> { ...@@ -28,7 +28,8 @@ export class FaceRecognitionNet extends NeuralNetwork<NetParams> {
return tf.tidy(() => { return tf.tidy(() => {
const batchTensor = input.toBatchTensor(150, true) const batchTensor = input.toBatchTensor(150, true)
const normalized = normalize(batchTensor) const meanRgb = [122.782, 117.001, 104.298]
const normalized = normalize(batchTensor, meanRgb).div(tf.scalar(256)) as tf.Tensor4D
let out = convDown(normalized, params.conv32_down) let out = convDown(normalized, params.conv32_down)
out = tf.maxPool(out, 3, 2, 'valid') out = tf.maxPool(out, 3, 2, 'valid')
......
import * as tf from '@tensorflow/tfjs-core';
export function normalize(x: tf.Tensor4D): tf.Tensor4D {
return tf.tidy(() => {
const avg_r = tf.fill([...x.shape.slice(0, 3), 1], 122.782);
const avg_g = tf.fill([...x.shape.slice(0, 3), 1], 117.001);
const avg_b = tf.fill([...x.shape.slice(0, 3), 1], 104.298);
const avg_rgb = tf.concat([avg_r, avg_g, avg_b], 3)
return tf.div(tf.sub(x, avg_rgb), tf.scalar(256))
})
}
\ No newline at end of file
import * as tf from '@tensorflow/tfjs-core'; import * as tf from '@tensorflow/tfjs-core';
import { allFacesFactory, allFacesMtcnnFactory } from './allFacesFactory'; import { allFacesMtcnnFactory, allFacesSsdMobilenetv1Factory, allFacesTinyYolov2Factory } from './allFacesFactory';
import { FaceDetection } from './FaceDetection'; import { FaceDetection } from './FaceDetection';
import { FaceDetectionNet } from './faceDetectionNet/FaceDetectionNet'; import { FaceDetectionNet } from './faceDetectionNet/FaceDetectionNet';
import { FaceLandmarkNet } from './faceLandmarkNet/FaceLandmarkNet'; import { FaceLandmarkNet } from './faceLandmarkNet/FaceLandmarkNet';
...@@ -21,15 +21,15 @@ export const recognitionNet = new FaceRecognitionNet() ...@@ -21,15 +21,15 @@ export const recognitionNet = new FaceRecognitionNet()
// nets need more specific names, to avoid ambiguity in future // nets need more specific names, to avoid ambiguity in future
// when alternative net implementations are provided // when alternative net implementations are provided
export const nets = { export const nets = {
ssdMobilenet: detectionNet, ssdMobilenetv1: detectionNet,
faceLandmark68Net: landmarkNet, faceLandmark68Net: landmarkNet,
faceRecognitionNet: recognitionNet, faceRecognitionNet: recognitionNet,
mtcnn: new Mtcnn(), mtcnn: new Mtcnn(),
tinyYolov2: new TinyYolov2() tinyYolov2: new TinyYolov2()
} }
export function loadFaceDetectionModel(url: string) { export function loadSsdMobilenetv1Model(url: string) {
return nets.ssdMobilenet.load(url) return nets.ssdMobilenetv1.load(url)
} }
export function loadFaceLandmarkModel(url: string) { export function loadFaceLandmarkModel(url: string) {
...@@ -48,9 +48,13 @@ export function loadTinyYolov2Model(url: string) { ...@@ -48,9 +48,13 @@ export function loadTinyYolov2Model(url: string) {
return nets.tinyYolov2.load(url) return nets.tinyYolov2.load(url)
} }
export function loadFaceDetectionModel(url: string) {
return loadSsdMobilenetv1Model(url)
}
export function loadModels(url: string) { export function loadModels(url: string) {
return Promise.all([ return Promise.all([
loadFaceDetectionModel(url), loadSsdMobilenetv1Model(url),
loadFaceLandmarkModel(url), loadFaceLandmarkModel(url),
loadFaceRecognitionModel(url), loadFaceRecognitionModel(url),
loadMtcnnModel(url), loadMtcnnModel(url),
...@@ -63,7 +67,7 @@ export function locateFaces( ...@@ -63,7 +67,7 @@ export function locateFaces(
minConfidence?: number, minConfidence?: number,
maxResults?: number maxResults?: number
): Promise<FaceDetection[]> { ): Promise<FaceDetection[]> {
return nets.ssdMobilenet.locateFaces(input, minConfidence, maxResults) return nets.ssdMobilenetv1.locateFaces(input, minConfidence, maxResults)
} }
export function detectLandmarks( export function detectLandmarks(
...@@ -92,14 +96,26 @@ export function tinyYolov2( ...@@ -92,14 +96,26 @@ export function tinyYolov2(
return nets.tinyYolov2.locateFaces(input, forwardParams) return nets.tinyYolov2.locateFaces(input, forwardParams)
} }
export type allFacesFunction = ( export type allFacesSsdMobilenetv1Function = (
input: tf.Tensor | NetInput | TNetInput, input: tf.Tensor | NetInput | TNetInput,
minConfidence?: number, minConfidence?: number,
useBatchProcessing?: boolean useBatchProcessing?: boolean
) => Promise<FullFaceDescription[]> ) => Promise<FullFaceDescription[]>
export const allFaces: allFacesFunction = allFacesFactory( export const allFacesSsdMobilenetv1: allFacesSsdMobilenetv1Function = allFacesSsdMobilenetv1Factory(
nets.ssdMobilenet, nets.ssdMobilenetv1,
nets.faceLandmark68Net,
nets.faceRecognitionNet
)
export type allFacesTinyYolov2Function = (
input: tf.Tensor | NetInput | TNetInput,
forwardParams?: TinyYolov2ForwardParams,
useBatchProcessing?: boolean
) => Promise<FullFaceDescription[]>
export const allFacesTinyYolov2: allFacesTinyYolov2Function = allFacesTinyYolov2Factory(
nets.tinyYolov2,
nets.faceLandmark68Net, nets.faceLandmark68Net,
nets.faceRecognitionNet nets.faceRecognitionNet
) )
...@@ -113,4 +129,6 @@ export type allFacesMtcnnFunction = ( ...@@ -113,4 +129,6 @@ export type allFacesMtcnnFunction = (
export const allFacesMtcnn: allFacesMtcnnFunction = allFacesMtcnnFactory( export const allFacesMtcnn: allFacesMtcnnFunction = allFacesMtcnnFactory(
nets.mtcnn, nets.mtcnn,
nets.faceRecognitionNet nets.faceRecognitionNet
) )
\ No newline at end of file
export const allFaces = allFacesSsdMobilenetv1
\ No newline at end of file
...@@ -5,6 +5,7 @@ export { ...@@ -5,6 +5,7 @@ export {
} }
export * from './BoundingBox';
export * from './FaceDetection'; export * from './FaceDetection';
export * from './FullFaceDescription'; export * from './FullFaceDescription';
export * from './NetInput'; export * from './NetInput';
...@@ -19,8 +20,11 @@ export * from './faceDetectionNet'; ...@@ -19,8 +20,11 @@ export * from './faceDetectionNet';
export * from './faceLandmarkNet'; export * from './faceLandmarkNet';
export * from './faceRecognitionNet'; export * from './faceRecognitionNet';
export * from './globalApi'; export * from './globalApi';
export * from './iou';
export * from './mtcnn'; export * from './mtcnn';
export * from './padToSquare'; export * from './padToSquare';
export * from './tinyYolov2'; export * from './tinyYolov2';
export * from './toNetInput'; export * from './toNetInput';
export * from './utils' export * from './utils';
\ No newline at end of file
export * from './types';
\ No newline at end of file
import { BoundingBox } from './BoundingBox';
export function iou(box1: BoundingBox, box2: BoundingBox, isIOU: boolean = true) {
const width = Math.max(0.0, Math.min(box1.right, box2.right) - Math.max(box1.left, box2.left) + 1)
const height = Math.max(0.0, Math.min(box1.bottom, box2.bottom) - Math.max(box1.top, box2.top) + 1)
const interSection = width * height
return isIOU
? interSection / (box1.area + box2.area - interSection)
: interSection / Math.min(box1.area, box2.area)
}
\ No newline at end of file
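The extracted iou helper can be used on its own as well. A sketch with made-up boxes (both BoundingBox and iou are re-exported from the package index):

``` javascript
// boxes are given as (left, top, right, bottom)
const box1 = new faceapi.BoundingBox(0, 0, 10, 10)
const box2 = new faceapi.BoundingBox(5, 5, 15, 15)

// the intersection is 6 x 6 = 36 (note the +1 pixel convention), both areas are 10 x 10 = 100
console.log(faceapi.iou(box1, box2))        // IoU: 36 / (100 + 100 - 36) ≈ 0.22
console.log(faceapi.iou(box1, box2, false)) // IoM: 36 / min(100, 100) = 0.36
```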
...@@ -4,21 +4,37 @@ import { BoundingBox } from '../BoundingBox'; ...@@ -4,21 +4,37 @@ import { BoundingBox } from '../BoundingBox';
import { convLayer } from '../commons/convLayer'; import { convLayer } from '../commons/convLayer';
import { NeuralNetwork } from '../commons/NeuralNetwork'; import { NeuralNetwork } from '../commons/NeuralNetwork';
import { nonMaxSuppression } from '../commons/nonMaxSuppression'; import { nonMaxSuppression } from '../commons/nonMaxSuppression';
import { normalize } from '../commons/normalize';
import { FaceDetection } from '../FaceDetection'; import { FaceDetection } from '../FaceDetection';
import { NetInput } from '../NetInput'; import { NetInput } from '../NetInput';
import { Point } from '../Point';
import { toNetInput } from '../toNetInput'; import { toNetInput } from '../toNetInput';
import { TNetInput } from '../types'; import { TNetInput } from '../types';
import { BOX_ANCHORS, INPUT_SIZES, IOU_THRESHOLD, NUM_BOXES } from './config'; import { sigmoid } from '../utils';
import { BOX_ANCHORS, BOX_ANCHORS_SEPARABLE, INPUT_SIZES, IOU_THRESHOLD, MEAN_RGB, NUM_BOXES } from './config';
import { convWithBatchNorm } from './convWithBatchNorm'; import { convWithBatchNorm } from './convWithBatchNorm';
import { extractParams } from './extractParams'; import { extractParams } from './extractParams';
import { getDefaultParams } from './getDefaultParams'; import { getDefaultParams } from './getDefaultParams';
import { loadQuantizedParams } from './loadQuantizedParams'; import { loadQuantizedParams } from './loadQuantizedParams';
import { NetParams, TinyYolov2ForwardParams } from './types'; import { NetParams, PostProcessingParams, TinyYolov2ForwardParams } from './types';
export class TinyYolov2 extends NeuralNetwork<NetParams> { export class TinyYolov2 extends NeuralNetwork<NetParams> {
constructor() { private _withSeparableConvs: boolean
private _anchors: Point[]
constructor(withSeparableConvs: boolean = true) {
super('TinyYolov2') super('TinyYolov2')
this._withSeparableConvs = withSeparableConvs
this._anchors = withSeparableConvs ? BOX_ANCHORS_SEPARABLE : BOX_ANCHORS
}
public get withSeparableConvs(): boolean {
return this._withSeparableConvs
}
public get anchors(): Point[] {
return this._anchors
} }
public forwardInput(input: NetInput, inputSize: number): tf.Tensor4D { public forwardInput(input: NetInput, inputSize: number): tf.Tensor4D {
...@@ -30,7 +46,12 @@ export class TinyYolov2 extends NeuralNetwork<NetParams> { ...@@ -30,7 +46,12 @@ export class TinyYolov2 extends NeuralNetwork<NetParams> {
} }
const out = tf.tidy(() => { const out = tf.tidy(() => {
const batchTensor = input.toBatchTensor(inputSize, false).div(tf.scalar(255)).toFloat() as tf.Tensor4D
let batchTensor = input.toBatchTensor(inputSize, false)
batchTensor = this.withSeparableConvs
? normalize(batchTensor, MEAN_RGB)
: batchTensor
batchTensor = batchTensor.div(tf.scalar(256)) as tf.Tensor4D
let out = convWithBatchNorm(batchTensor, params.conv0) let out = convWithBatchNorm(batchTensor, params.conv0)
out = tf.maxPool(out, [2, 2], [2, 2], 'same') out = tf.maxPool(out, [2, 2], [2, 2], 'same')
...@@ -67,44 +88,78 @@ export class TinyYolov2 extends NeuralNetwork<NetParams> { ...@@ -67,44 +88,78 @@ export class TinyYolov2 extends NeuralNetwork<NetParams> {
: _inputSize : _inputSize
if (typeof inputSize !== 'number') { if (typeof inputSize !== 'number') {
throw new Error(`TinyYolov2 - unknown inputSize: ${inputSize}, expected number or one of xs | sm | md | lg`)
} }
const netInput = await toNetInput(input, true) const netInput = await toNetInput(input, true)
const out = await this.forwardInput(netInput, inputSize) const out = await this.forwardInput(netInput, inputSize)
const numCells = out.shape[1] const out0 = tf.tidy(() => tf.unstack(out)[0].expandDims()) as tf.Tensor4D
const inputDimensions = {
width: netInput.getInputWidth(0),
height: netInput.getInputHeight(0)
}
const paddings = netInput.getRelativePaddings(0)
const results = this.postProcess(out0, { scoreThreshold, paddings })
const boxes = results.map(res => res.box)
const scores = results.map(res => res.score)
out.dispose()
out0.dispose()
const indices = nonMaxSuppression(
boxes.map(box => box.rescale(inputSize)),
scores,
IOU_THRESHOLD,
true
)
const detections = indices.map(idx =>
new FaceDetection(
scores[idx],
boxes[idx].toRect(),
inputDimensions
)
)
return detections
}
public postProcess(outputTensor: tf.Tensor4D, { scoreThreshold, paddings }: PostProcessingParams) {
const numCells = outputTensor.shape[1]
const [boxesTensor, scoresTensor] = tf.tidy(() => { const [boxesTensor, scoresTensor] = tf.tidy(() => {
const reshaped = out.reshape([numCells, numCells, NUM_BOXES, 6]) const reshaped = outputTensor.reshape([numCells, numCells, NUM_BOXES, this.withSeparableConvs ? 5 : 6])
out.dispose()
const boxes = reshaped.slice([0, 0, 0, 0], [numCells, numCells, NUM_BOXES, 4]) const boxes = reshaped.slice([0, 0, 0, 0], [numCells, numCells, NUM_BOXES, 4])
const scores = reshaped.slice([0, 0, 0, 4], [numCells, numCells, NUM_BOXES, 1]) const scores = reshaped.slice([0, 0, 0, 4], [numCells, numCells, NUM_BOXES, 1])
return [boxes, scores] return [boxes, scores]
}) })
const expit = (x: number): number => 1 / (1 + Math.exp(-x)) const results = []
const paddedHeightRelative = (netInput.getPaddings(0).y + netInput.getInputHeight(0)) / netInput.getInputHeight(0)
const paddedWidthRelative = (netInput.getPaddings(0).x + netInput.getInputWidth(0)) / netInput.getInputWidth(0)
const boxes: BoundingBox[] = []
const scores: number[] = []
for (let row = 0; row < numCells; row ++) { for (let row = 0; row < numCells; row ++) {
for (let col = 0; col < numCells; col ++) { for (let col = 0; col < numCells; col ++) {
for (let box = 0; box < NUM_BOXES; box ++) { for (let anchor = 0; anchor < NUM_BOXES; anchor ++) {
const score = expit(scoresTensor.get(row, col, box, 0)) const score = sigmoid(scoresTensor.get(row, col, anchor, 0))
if (score > scoreThreshold) { if (!scoreThreshold || score > scoreThreshold) {
const ctX = ((col + expit(boxesTensor.get(row, col, box, 0))) / numCells) * paddedWidthRelative const ctX = ((col + sigmoid(boxesTensor.get(row, col, anchor, 0))) / numCells) * paddings.x
const ctY = ((row + expit(boxesTensor.get(row, col, box, 1))) / numCells) * paddedHeightRelative const ctY = ((row + sigmoid(boxesTensor.get(row, col, anchor, 1))) / numCells) * paddings.y
const width = ((Math.exp(boxesTensor.get(row, col, box, 2)) * BOX_ANCHORS[box].x) / numCells) * paddedWidthRelative const width = ((Math.exp(boxesTensor.get(row, col, anchor, 2)) * this.anchors[anchor].x) / numCells) * paddings.x
const height = ((Math.exp(boxesTensor.get(row, col, box, 3)) * BOX_ANCHORS[box].y) / numCells) * paddedHeightRelative const height = ((Math.exp(boxesTensor.get(row, col, anchor, 3)) * this.anchors[anchor].y) / numCells) * paddings.y
const x = (ctX - (width / 2)) const x = (ctX - (width / 2))
const y = (ctY - (height / 2)) const y = (ctY - (height / 2))
boxes.push(new BoundingBox(x, y, x + width, y + height)) results.push({
scores.push(score) box: new BoundingBox(x, y, x + width, y + height),
score,
row,
col,
anchor
})
} }
} }
} }
...@@ -113,34 +168,14 @@ export class TinyYolov2 extends NeuralNetwork<NetParams> { ...@@ -113,34 +168,14 @@ export class TinyYolov2 extends NeuralNetwork<NetParams> {
boxesTensor.dispose() boxesTensor.dispose()
scoresTensor.dispose() scoresTensor.dispose()
const indices = nonMaxSuppression( return results
boxes.map(box => new BoundingBox(
box.left * inputSize,
box.top * inputSize,
box.right * inputSize,
box.bottom * inputSize
)),
scores,
IOU_THRESHOLD,
true
)
const detections = indices.map(idx =>
new FaceDetection(
scores[idx],
boxes[idx].toRect(),
{ width: netInput.getInputWidth(0), height: netInput.getInputHeight(0) }
)
)
return detections
} }
protected loadQuantizedParams(uri: string | undefined) { protected loadQuantizedParams(uri: string | undefined) {
return loadQuantizedParams(uri) return loadQuantizedParams(uri, this.withSeparableConvs)
} }
protected extractParams(weights: Float32Array) { protected extractParams(weights: Float32Array) {
return extractParams(weights) return extractParams(weights, this.withSeparableConvs)
} }
} }
\ No newline at end of file
...@@ -10,4 +10,14 @@ export const BOX_ANCHORS = [ ...@@ -10,4 +10,14 @@ export const BOX_ANCHORS = [
new Point(4.30971, 7.04493), new Point(4.30971, 7.04493),
new Point(10.246, 4.59428), new Point(10.246, 4.59428),
new Point(12.6868, 11.8741) new Point(12.6868, 11.8741)
] ]
\ No newline at end of file
export const BOX_ANCHORS_SEPARABLE = [
new Point(1.603231, 2.094468),
new Point(6.041143, 7.080126),
new Point(2.882459, 3.518061),
new Point(4.266906, 5.178857),
new Point(9.041765, 10.66308)
]
export const MEAN_RGB = [117.001, 114.697, 97.404]
\ No newline at end of file
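To illustrate how these anchors are consumed in postProcess, a small sketch decoding a single grid cell prediction (the raw outputs tx, ty, tw, th are made up, and the relative paddings are assumed to be 1):

``` javascript
const sigmoid = x => 1 / (1 + Math.exp(-x))
const [numCells, anchor] = [13, { x: 6.041143, y: 7.080126 }] // second separable conv anchor
const [row, col, tx, ty, tw, th] = [6, 6, 0, 0, 0, 0]         // made-up raw network outputs

const ctX = (col + sigmoid(tx)) / numCells          // 0.5
const ctY = (row + sigmoid(ty)) / numCells          // 0.5
const width = (Math.exp(tw) * anchor.x) / numCells  // ~0.46 of the (padded) input width
const height = (Math.exp(th) * anchor.y) / numCells // ~0.54 of the (padded) input height
```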
import * as tf from '@tensorflow/tfjs-core';

import { leaky } from './leaky';
import { ConvWithBatchNorm, SeparableConvParams } from './types';

export function convWithBatchNorm(x: tf.Tensor4D, params: ConvWithBatchNorm | SeparableConvParams): tf.Tensor4D {
  return tf.tidy(() => {
    let out = tf.pad(x, [[0, 0], [1, 1], [1, 1], [0, 0]]) as tf.Tensor4D

    if (params instanceof SeparableConvParams) {
      out = tf.separableConv2d(out, params.depthwise_filter, params.pointwise_filter, [1, 1], 'valid')
      out = tf.add(out, params.bias)
    } else {
      out = tf.conv2d(out, params.conv.filters, [1, 1], 'valid')
      out = tf.sub(out, params.bn.sub)
      out = tf.mul(out, params.bn.truediv)
      out = tf.add(out, params.conv.bias)
    }

    return leaky(out)
  })
}
\ No newline at end of file
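To make the size benefit of the separable convolutions concrete, a back-of-the-envelope sketch (illustrative helpers; channel numbers taken from extractParams, batch norm parameters of the regular variant ignored):

``` javascript
// parameter counts for a 3x3 conv layer with cIn input and cOut output channels
const regularConvParams = (cIn, cOut) => 3 * 3 * cIn * cOut + cOut          // filters + bias
const separableConvParams = (cIn, cOut) => 3 * 3 * cIn + cIn * cOut + cOut  // depthwise + pointwise + bias

// e.g. conv6 (512 -> 1024 channels) of the Tiny Yolo v2 backbone
console.log(regularConvParams(512, 1024))   // 4719616
console.log(separableConvParams(512, 1024)) // 529920
```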
...@@ -3,7 +3,7 @@ import * as tf from '@tensorflow/tfjs-core'; ...@@ -3,7 +3,7 @@ import * as tf from '@tensorflow/tfjs-core';
import { extractConvParamsFactory } from '../commons/extractConvParamsFactory'; import { extractConvParamsFactory } from '../commons/extractConvParamsFactory';
import { extractWeightsFactory } from '../commons/extractWeightsFactory'; import { extractWeightsFactory } from '../commons/extractWeightsFactory';
import { ExtractWeightsFunction, ParamMapping } from '../commons/types'; import { ExtractWeightsFunction, ParamMapping } from '../commons/types';
import { BatchNorm, ConvWithBatchNorm, NetParams } from './types'; import { BatchNorm, ConvWithBatchNorm, NetParams, SeparableConvParams } from './types';
function extractorsFactory(extractWeights: ExtractWeightsFunction, paramMappings: ParamMapping[]) { function extractorsFactory(extractWeights: ExtractWeightsFunction, paramMappings: ParamMapping[]) {
...@@ -30,14 +30,33 @@ function extractorsFactory(extractWeights: ExtractWeightsFunction, paramMappings ...@@ -30,14 +30,33 @@ function extractorsFactory(extractWeights: ExtractWeightsFunction, paramMappings
return { conv, bn } return { conv, bn }
} }
function extractSeparableConvParams(channelsIn: number, channelsOut: number, mappedPrefix: string): SeparableConvParams {
const depthwise_filter = tf.tensor4d(extractWeights(3 * 3 * channelsIn), [3, 3, channelsIn, 1])
const pointwise_filter = tf.tensor4d(extractWeights(channelsIn * channelsOut), [1, 1, channelsIn, channelsOut])
const bias = tf.tensor1d(extractWeights(channelsOut))
paramMappings.push(
{ paramPath: `${mappedPrefix}/depthwise_filter` },
{ paramPath: `${mappedPrefix}/pointwise_filter` },
{ paramPath: `${mappedPrefix}/bias` }
)
return new SeparableConvParams(
depthwise_filter,
pointwise_filter,
bias
)
}
return { return {
extractConvParams, extractConvParams,
extractConvWithBatchNormParams extractConvWithBatchNormParams,
extractSeparableConvParams
} }
} }
export function extractParams(weights: Float32Array): { params: NetParams, paramMappings: ParamMapping[] } { export function extractParams(weights: Float32Array, withSeparableConvs: boolean): { params: NetParams, paramMappings: ParamMapping[] } {
const { const {
extractWeights, extractWeights,
...@@ -48,18 +67,22 @@ export function extractParams(weights: Float32Array): { params: NetParams, param ...@@ -48,18 +67,22 @@ export function extractParams(weights: Float32Array): { params: NetParams, param
const { const {
extractConvParams, extractConvParams,
extractConvWithBatchNormParams extractConvWithBatchNormParams,
extractSeparableConvParams
} = extractorsFactory(extractWeights, paramMappings) } = extractorsFactory(extractWeights, paramMappings)
const conv0 = extractConvWithBatchNormParams(3, 16, 'conv0') const extractConvFn = withSeparableConvs ? extractSeparableConvParams : extractConvWithBatchNormParams
const conv1 = extractConvWithBatchNormParams(16, 32, 'conv1') const numAnchorEncodings = withSeparableConvs ? 5 : 6
const conv2 = extractConvWithBatchNormParams(32, 64, 'conv2')
const conv3 = extractConvWithBatchNormParams(64, 128, 'conv3') const conv0 = extractConvFn(3, 16, 'conv0')
const conv4 = extractConvWithBatchNormParams(128, 256, 'conv4') const conv1 = extractConvFn(16, 32, 'conv1')
const conv5 = extractConvWithBatchNormParams(256, 512, 'conv5') const conv2 = extractConvFn(32, 64, 'conv2')
const conv6 = extractConvWithBatchNormParams(512, 1024, 'conv6') const conv3 = extractConvFn(64, 128, 'conv3')
const conv7 = extractConvWithBatchNormParams(1024, 1024, 'conv7') const conv4 = extractConvFn(128, 256, 'conv4')
const conv8 = extractConvParams(1024, 30, 1, 'conv8') const conv5 = extractConvFn(256, 512, 'conv5')
const conv6 = extractConvFn(512, 1024, 'conv6')
const conv7 = extractConvFn(1024, 1024, 'conv7')
const conv8 = extractConvParams(1024, 5 * numAnchorEncodings, 1, 'conv8')
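// Note on the output channel count (sketch): the final 1x1 conv emits one encoding
// block per anchor box. For the separable conv model this is 5 anchors * 5 values
// (x, y, w, h, score) = 25 channels, matching the 25-channel tensors used by the
// loss and initialization code further below; the legacy batch norm model packs
// 6 values per anchor (hence the previous 30 channels) - the meaning of the extra
// value is not shown in this diff.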
if (getRemainingWeights().length !== 0) { if (getRemainingWeights().length !== 0) {
throw new Error(`weights remaining after extract: ${getRemainingWeights().length}`) throw new Error(`weights remaining after extract: ${getRemainingWeights().length}`)
......
...@@ -4,7 +4,7 @@ export function getDefaultParams(params: TinyYolov2ForwardParams) { ...@@ -4,7 +4,7 @@ export function getDefaultParams(params: TinyYolov2ForwardParams) {
return Object.assign( return Object.assign(
{}, {},
{ {
sizeType: SizeType.MD, inputSize: SizeType.MD,
scoreThreshold: 0.5 scoreThreshold: 0.5
}, },
params params
......
...@@ -2,8 +2,8 @@ import { TinyYolov2 } from './TinyYolov2'; ...@@ -2,8 +2,8 @@ import { TinyYolov2 } from './TinyYolov2';
export * from './TinyYolov2'; export * from './TinyYolov2';
export function createTinyYolov2(weights: Float32Array) { export function createTinyYolov2(weights: Float32Array, withSeparableConvs: boolean = true) {
const net = new TinyYolov2() const net = new TinyYolov2(withSeparableConvs)
net.extractWeights(weights) net.extractWeights(weights)
return net return net
} }
\ No newline at end of file
...@@ -2,6 +2,8 @@ import * as tf from '@tensorflow/tfjs-core'; ...@@ -2,6 +2,8 @@ import * as tf from '@tensorflow/tfjs-core';
export function leaky(x: tf.Tensor4D): tf.Tensor4D { export function leaky(x: tf.Tensor4D): tf.Tensor4D {
return tf.tidy(() => { return tf.tidy(() => {
return tf.maximum(x, tf.mul(x, tf.scalar(0.10000000149011612))) const min = tf.mul(x, tf.scalar(0.10000000149011612))
return tf.add(tf.relu(tf.sub(x, min)), min)
//return tf.maximum(x, min)
}) })
} }
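// Why the rewrite above is equivalent (sketch, assuming tf is the
// '@tensorflow/tfjs-core' namespace imported in this file):
// relu(x - 0.1x) + 0.1x = max(x - 0.1x, 0) + 0.1x = max(x, 0.1x),
// i.e. the same leaky ReLU as before, expressed without tf.maximum.
const checkLeakyEquivalence = (x: tf.Tensor4D) => tf.tidy(() => {
  const min = tf.mul(x, tf.scalar(0.1))
  const viaRelu = tf.add(tf.relu(tf.sub(x, min)), min)
  const viaMax = tf.maximum(x, min)
  console.log(tf.sum(tf.abs(tf.sub(viaRelu, viaMax))).dataSync()[0]) // ~0
})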
\ No newline at end of file
...@@ -4,9 +4,10 @@ import { disposeUnusedWeightTensors } from '../commons/disposeUnusedWeightTensor ...@@ -4,9 +4,10 @@ import { disposeUnusedWeightTensors } from '../commons/disposeUnusedWeightTensor
import { extractWeightEntryFactory } from '../commons/extractWeightEntryFactory'; import { extractWeightEntryFactory } from '../commons/extractWeightEntryFactory';
import { loadWeightMap } from '../commons/loadWeightMap'; import { loadWeightMap } from '../commons/loadWeightMap';
import { ConvParams, ParamMapping } from '../commons/types'; import { ConvParams, ParamMapping } from '../commons/types';
import { BatchNorm, ConvWithBatchNorm, NetParams } from './types'; import { BatchNorm, ConvWithBatchNorm, NetParams, SeparableConvParams } from './types';
const DEFAULT_MODEL_NAME = 'tiny_yolov2_model' const DEFAULT_MODEL_NAME = 'tiny_yolov2_model'
const DEFAULT_MODEL_NAME_SEPARABLE_CONV = 'tiny_yolov2_separable_conv_model'
function extractorsFactory(weightMap: any, paramMappings: ParamMapping[]) { function extractorsFactory(weightMap: any, paramMappings: ParamMapping[]) {
...@@ -30,35 +31,51 @@ function extractorsFactory(weightMap: any, paramMappings: ParamMapping[]) { ...@@ -30,35 +31,51 @@ function extractorsFactory(weightMap: any, paramMappings: ParamMapping[]) {
return { conv, bn } return { conv, bn }
} }
function extractSeparableConvParams(prefix: string): SeparableConvParams {
const depthwise_filter = extractWeightEntry<tf.Tensor4D>(`${prefix}/depthwise_filter`, 4)
const pointwise_filter = extractWeightEntry<tf.Tensor4D>(`${prefix}/pointwise_filter`, 4)
const bias = extractWeightEntry<tf.Tensor1D>(`${prefix}/bias`, 1)
return new SeparableConvParams(
depthwise_filter,
pointwise_filter,
bias
)
}
return { return {
extractConvParams, extractConvParams,
extractConvWithBatchNormParams extractConvWithBatchNormParams,
extractSeparableConvParams
} }
} }
export async function loadQuantizedParams( export async function loadQuantizedParams(
uri: string | undefined uri: string | undefined,
withSeparableConvs: boolean
): Promise<{ params: NetParams, paramMappings: ParamMapping[] }> { ): Promise<{ params: NetParams, paramMappings: ParamMapping[] }> {
const weightMap = await loadWeightMap(uri, DEFAULT_MODEL_NAME) const weightMap = await loadWeightMap(uri, withSeparableConvs ? DEFAULT_MODEL_NAME_SEPARABLE_CONV : DEFAULT_MODEL_NAME)
const paramMappings: ParamMapping[] = [] const paramMappings: ParamMapping[] = []
const { const {
extractConvParams, extractConvParams,
extractConvWithBatchNormParams extractConvWithBatchNormParams,
extractSeparableConvParams
} = extractorsFactory(weightMap, paramMappings) } = extractorsFactory(weightMap, paramMappings)
const extractConvFn = withSeparableConvs ? extractSeparableConvParams : extractConvWithBatchNormParams
const params = { const params = {
conv0: extractConvWithBatchNormParams('conv0'), conv0: extractConvFn('conv0'),
conv1: extractConvWithBatchNormParams('conv1'), conv1: extractConvFn('conv1'),
conv2: extractConvWithBatchNormParams('conv2'), conv2: extractConvFn('conv2'),
conv3: extractConvWithBatchNormParams('conv3'), conv3: extractConvFn('conv3'),
conv4: extractConvWithBatchNormParams('conv4'), conv4: extractConvFn('conv4'),
conv5: extractConvWithBatchNormParams('conv5'), conv5: extractConvFn('conv5'),
conv6: extractConvWithBatchNormParams('conv6'), conv6: extractConvFn('conv6'),
conv7: extractConvWithBatchNormParams('conv7'), conv7: extractConvFn('conv7'),
conv8: extractConvParams('conv8') conv8: extractConvParams('conv8')
} }
......
import * as tf from '@tensorflow/tfjs-core'; import * as tf from '@tensorflow/tfjs-core';
import { ConvParams } from '../commons/types'; import { ConvParams } from '../commons/types';
import { Point } from '../Point';
export type BatchNorm = { export type BatchNorm = {
sub: tf.Tensor1D sub: tf.Tensor1D
truediv: tf.Tensor1D truediv: tf.Tensor1D
} }
export class SeparableConvParams {
constructor(
public depthwise_filter: tf.Tensor4D,
public pointwise_filter: tf.Tensor4D,
public bias: tf.Tensor1D
) {}
}
export type ConvWithBatchNorm = { export type ConvWithBatchNorm = {
conv: ConvParams conv: ConvParams
bn: BatchNorm bn: BatchNorm
} }
export type NetParams = { export type NetParams = {
conv0: ConvWithBatchNorm conv0: ConvWithBatchNorm | SeparableConvParams
conv1: ConvWithBatchNorm conv1: ConvWithBatchNorm | SeparableConvParams
conv2: ConvWithBatchNorm conv2: ConvWithBatchNorm | SeparableConvParams
conv3: ConvWithBatchNorm conv3: ConvWithBatchNorm | SeparableConvParams
conv4: ConvWithBatchNorm conv4: ConvWithBatchNorm | SeparableConvParams
conv5: ConvWithBatchNorm conv5: ConvWithBatchNorm | SeparableConvParams
conv6: ConvWithBatchNorm conv6: ConvWithBatchNorm | SeparableConvParams
conv7: ConvWithBatchNorm conv7: ConvWithBatchNorm | SeparableConvParams
conv8: ConvParams conv8: ConvParams
} }
...@@ -34,4 +43,9 @@ export enum SizeType { ...@@ -34,4 +43,9 @@ export enum SizeType {
export type TinyYolov2ForwardParams = { export type TinyYolov2ForwardParams = {
inputSize?: SizeType | number inputSize?: SizeType | number
scoreThreshold?: number scoreThreshold?: number
}
export type PostProcessingParams = {
scoreThreshold?: number
paddings: Point
} }
\ No newline at end of file
...@@ -15,6 +15,14 @@ export function round(num: number) { ...@@ -15,6 +15,14 @@ export function round(num: number) {
return Math.floor(num * 100) / 100 return Math.floor(num * 100) / 100
} }
export function sigmoid(x: number) {
return 1 / (1 + Math.exp(-x))
}
export function isDimensions(obj: any): boolean {
return obj && obj.width && obj.height
}
export function resolveInput(arg: string | any) { export function resolveInput(arg: string | any) {
if (typeof arg === 'string') { if (typeof arg === 'string') {
return document.getElementById(arg) return document.getElementById(arg)
...@@ -37,14 +45,14 @@ export function awaitMediaLoaded(media: HTMLImageElement | HTMLVideoElement | HT ...@@ -37,14 +45,14 @@ export function awaitMediaLoaded(media: HTMLImageElement | HTMLVideoElement | HT
if (!e.currentTarget) return if (!e.currentTarget) return
e.currentTarget.removeEventListener('load', onLoad) e.currentTarget.removeEventListener('load', onLoad)
e.currentTarget.removeEventListener('error', onError) e.currentTarget.removeEventListener('error', onError)
resolve() resolve(e)
} }
function onError(e: Event) { function onError(e: Event) {
if (!e.currentTarget) return if (!e.currentTarget) return
e.currentTarget.removeEventListener('load', onLoad) e.currentTarget.removeEventListener('load', onLoad)
e.currentTarget.removeEventListener('error', onError) e.currentTarget.removeEventListener('error', onError)
reject() reject(e)
} }
media.addEventListener('load', onLoad) media.addEventListener('load', onLoad)
......
...@@ -6,7 +6,7 @@ import { toNetInput } from '../../../src'; ...@@ -6,7 +6,7 @@ import { toNetInput } from '../../../src';
import * as tf from '@tensorflow/tfjs-core'; import * as tf from '@tensorflow/tfjs-core';
import { Point } from '../../../src/Point'; import { Point } from '../../../src/Point';
describe('allFaces', () => { describe('allFacesSsdMobilenetv1', () => {
let imgEl: HTMLImageElement let imgEl: HTMLImageElement
let facesFaceLandmarkPositions: Point[][] let facesFaceLandmarkPositions: Point[][]
...@@ -19,14 +19,14 @@ describe('allFaces', () => { ...@@ -19,14 +19,14 @@ describe('allFaces', () => {
facesFaceDescriptors = await (await fetch('base/test/data/facesFaceDescriptorsSsd.json')).json() facesFaceDescriptors = await (await fetch('base/test/data/facesFaceDescriptorsSsd.json')).json()
}) })
describeWithNets('computes full face descriptions', { withAllFaces: true }, ({ allFaces }) => { describeWithNets('computes full face descriptions', { withAllFacesSsdMobilenetv1: true }, ({ allFacesSsdMobilenetv1 }) => {
const expectedScores = [0.97, 0.88, 0.83, 0.82, 0.59, 0.52] const expectedScores = [0.97, 0.88, 0.83, 0.82, 0.59, 0.52]
const maxBoxDelta = 5 const maxBoxDelta = 5
const maxLandmarkPointsDelta = 1 const maxLandmarkPointsDelta = 1
it('scores > 0.8', async () => { it('scores > 0.8', async () => {
const results = await allFaces(imgEl, 0.8) const results = await allFacesSsdMobilenetv1(imgEl, 0.8)
expect(results.length).toEqual(4) expect(results.length).toEqual(4)
results.forEach(({ detection, landmarks, descriptor }, i) => { results.forEach(({ detection, landmarks, descriptor }, i) => {
...@@ -40,7 +40,7 @@ describe('allFaces', () => { ...@@ -40,7 +40,7 @@ describe('allFaces', () => {
}) })
it('scores > 0.5', async () => { it('scores > 0.5', async () => {
const results = await allFaces(imgEl, 0.5) const results = await allFacesSsdMobilenetv1(imgEl, 0.5)
expect(results.length).toEqual(6) expect(results.length).toEqual(6)
results.forEach(({ detection, landmarks, descriptor }, i) => { results.forEach(({ detection, landmarks, descriptor }, i) => {
...@@ -55,11 +55,11 @@ describe('allFaces', () => { ...@@ -55,11 +55,11 @@ describe('allFaces', () => {
}) })
describeWithNets('no memory leaks', { withAllFaces: true }, ({ allFaces }) => { describeWithNets('no memory leaks', { withAllFacesSsdMobilenetv1: true }, ({ allFacesSsdMobilenetv1 }) => {
it('single image element', async () => { it('single image element', async () => {
await expectAllTensorsReleased(async () => { await expectAllTensorsReleased(async () => {
await allFaces(imgEl) await allFacesSsdMobilenetv1(imgEl)
}) })
}) })
...@@ -68,7 +68,7 @@ describe('allFaces', () => { ...@@ -68,7 +68,7 @@ describe('allFaces', () => {
await expectAllTensorsReleased(async () => { await expectAllTensorsReleased(async () => {
const netInput = (new NetInput([tensor])).managed() const netInput = (new NetInput([tensor])).managed()
await allFaces(netInput) await allFacesSsdMobilenetv1(netInput)
}) })
tensor.dispose() tensor.dispose()
...@@ -78,7 +78,7 @@ describe('allFaces', () => { ...@@ -78,7 +78,7 @@ describe('allFaces', () => {
const tensor = tf.tidy(() => tf.fromPixels(imgEl).expandDims()) as tf.Tensor4D const tensor = tf.tidy(() => tf.fromPixels(imgEl).expandDims()) as tf.Tensor4D
await expectAllTensorsReleased(async () => { await expectAllTensorsReleased(async () => {
await allFaces(await toNetInput(tensor, true)) await allFacesSsdMobilenetv1(await toNetInput(tensor, true))
}) })
tensor.dispose() tensor.dispose()
......
import * as faceapi from '../../../src';
import { describeWithNets, expectAllTensorsReleased, expectRectClose, expectPointClose, expectMaxDelta } from '../../utils';
import { expectedTinyYolov2SeparableConvBoxes } from './expectedResults';
import { NetInput } from '../../../src/NetInput';
import { toNetInput } from '../../../src';
import * as tf from '@tensorflow/tfjs-core';
import { Point } from '../../../src/Point';
import { SizeType } from '../../../src/tinyYolov2/types';
describe('allFacesTinyYolov2', () => {
let imgEl: HTMLImageElement
let facesFaceLandmarkPositions: Point[][]
let facesFaceDescriptors: number[][]
beforeAll(async () => {
const img = await (await fetch('base/test/images/faces.jpg')).blob()
imgEl = await faceapi.bufferToImage(img)
facesFaceLandmarkPositions = await (await fetch('base/test/data/facesFaceLandmarkPositions.json')).json()
facesFaceDescriptors = await (await fetch('base/test/data/facesFaceDescriptorsSsd.json')).json()
})
describeWithNets('computes full face descriptions', { withAllFacesTinyYolov2: true }, ({ allFacesTinyYolov2 }) => {
it('SizeType.LG', async () => {
const expectedScores = [0.9, 0.9, 0.89, 0.85, 0.85, 0.85]
const maxBoxDelta = 5
const maxLandmarkPointsDelta = 10
const maxDescriptorDelta = 0.06
const results = await allFacesTinyYolov2(imgEl, { inputSize: SizeType.LG })
const detectionOrder = [0, 2, 3, 4, 1, 5]
expect(results.length).toEqual(6)
results.forEach(({ detection, landmarks, descriptor }, i) => {
expect(detection.getImageWidth()).toEqual(imgEl.width)
expect(detection.getImageHeight()).toEqual(imgEl.height)
expect(detection.getScore()).toBeCloseTo(expectedScores[i], 2)
expectRectClose(detection.getBox(), expectedTinyYolov2SeparableConvBoxes[i], maxBoxDelta)
landmarks.getPositions().forEach((pt, j) => expectPointClose(pt, facesFaceLandmarkPositions[detectionOrder[i]][j], maxLandmarkPointsDelta))
descriptor.forEach((val, j) => expectMaxDelta(val, facesFaceDescriptors[detectionOrder[i]][j], maxDescriptorDelta))
})
})
it('SizeType.MD', async () => {
const expectedScores = [0.85, 0.85, 0.84, 0.83, 0.8, 0.8]
const maxBoxDelta = 17
const maxLandmarkPointsDelta = 16
const maxDescriptorDelta = 0.05
const results = await allFacesTinyYolov2(imgEl, { inputSize: SizeType.MD })
const boxOrder = [5, 1, 4, 3, 2, 0]
const detectionOrder = [5, 2, 1, 4, 3, 0]
expect(results.length).toEqual(6)
results.forEach(({ detection, landmarks, descriptor }, i) => {
expect(detection.getImageWidth()).toEqual(imgEl.width)
expect(detection.getImageHeight()).toEqual(imgEl.height)
expect(detection.getScore()).toBeCloseTo(expectedScores[i], 2)
expectRectClose(detection.getBox(), expectedTinyYolov2SeparableConvBoxes[boxOrder[i]], maxBoxDelta)
landmarks.getPositions().forEach((pt, j) => expectPointClose(pt, facesFaceLandmarkPositions[detectionOrder[i]][j], maxLandmarkPointsDelta))
descriptor.forEach((val, j) => expectMaxDelta(val, facesFaceDescriptors[detectionOrder[i]][j], maxDescriptorDelta))
})
})
})
describeWithNets('no memory leaks', { withAllFacesTinyYolov2: true }, ({ allFacesTinyYolov2 }) => {
it('single image element', async () => {
await expectAllTensorsReleased(async () => {
await allFacesTinyYolov2(imgEl)
})
})
it('single tf.Tensor3D', async () => {
const tensor = tf.fromPixels(imgEl)
await expectAllTensorsReleased(async () => {
const netInput = (new NetInput([tensor])).managed()
await allFacesTinyYolov2(netInput)
})
tensor.dispose()
})
it('single batch size 1 tf.Tensor4Ds', async () => {
const tensor = tf.tidy(() => tf.fromPixels(imgEl).expandDims()) as tf.Tensor4D
await expectAllTensorsReleased(async () => {
await allFacesTinyYolov2(await toNetInput(tensor, true))
})
tensor.dispose()
})
})
})
\ No newline at end of file
...@@ -29,6 +29,15 @@ export const expectedTinyYolov2Boxes = [ ...@@ -29,6 +29,15 @@ export const expectedTinyYolov2Boxes = [
{ x: 87, y: 30, width: 92, height: 93 } { x: 87, y: 30, width: 92, height: 93 }
] ]
export const expectedTinyYolov2SeparableConvBoxes = [
{ x: 42, y: 257, width: 111, height: 121 },
{ x: 454, y: 175, width: 104, height: 121 },
{ x: 230, y: 45, width: 94, height: 104 },
{ x: 574, y: 62, width: 88, height: 113 },
{ x: 260, y: 233, width: 82, height: 104 },
{ x: 83, y: 24, width: 85, height: 111 }
]
export const expectedMtcnnFaceLandmarks = [ export const expectedMtcnnFaceLandmarks = [
[new Point(117, 58), new Point(156, 63), new Point(141, 86), new Point(109, 98), new Point(147, 104)], [new Point(117, 58), new Point(156, 63), new Point(141, 86), new Point(109, 98), new Point(147, 104)],
[new Point(82, 292), new Point(134, 304), new Point(104, 330), new Point(72, 342), new Point(120, 353)], [new Point(82, 292), new Point(134, 304), new Point(104, 330), new Point(72, 342), new Point(120, 353)],
...@@ -38,7 +47,6 @@ export const expectedMtcnnFaceLandmarks = [ ...@@ -38,7 +47,6 @@ export const expectedMtcnnFaceLandmarks = [
[new Point(489, 224), new Point(534, 223), new Point(507, 250), new Point(493, 271), new Point(530, 270)] [new Point(489, 224), new Point(534, 223), new Point(507, 250), new Point(493, 271), new Point(530, 270)]
] ]
export function expectMtcnnResults( export function expectMtcnnResults(
results: { faceDetection: faceapi.FaceDetection, faceLandmarks: faceapi.FaceLandmarks5 }[], results: { faceDetection: faceapi.FaceDetection, faceLandmarks: faceapi.FaceLandmarks5 }[],
boxOrder: number[], boxOrder: number[],
......
import * as faceapi from '../../../src'; import * as faceapi from '../../../src';
import { SizeType } from '../../../src/tinyYolov2/types'; import { SizeType } from '../../../src/tinyYolov2/types';
import { describeWithNets, expectAllTensorsReleased, expectRectClose } from '../../utils'; import { describeWithNets, expectAllTensorsReleased, expectRectClose } from '../../utils';
import { expectedTinyYolov2Boxes } from './expectedResults'; import { expectedTinyYolov2Boxes, expectedTinyYolov2SeparableConvBoxes } from './expectedResults';
describe('tinyYolov2', () => { describe('tinyYolov2', () => {
...@@ -13,7 +12,7 @@ describe('tinyYolov2', () => { ...@@ -13,7 +12,7 @@ describe('tinyYolov2', () => {
imgEl = await faceapi.bufferToImage(img) imgEl = await faceapi.bufferToImage(img)
}) })
describeWithNets('quantized weights', { withTinyYolov2: { quantized: true } }, ({ tinyYolov2 }) => { describeWithNets('quantized weights', { withTinyYolov2: { quantized: true, withSeparableConv: false } }, ({ tinyYolov2 }) => {
it('inputSize lg, finds all faces', async () => { it('inputSize lg, finds all faces', async () => {
const detections = await tinyYolov2.locateFaces(imgEl, { inputSize: SizeType.LG }) const detections = await tinyYolov2.locateFaces(imgEl, { inputSize: SizeType.LG })
...@@ -59,7 +58,7 @@ describe('tinyYolov2', () => { ...@@ -59,7 +58,7 @@ describe('tinyYolov2', () => {
}) })
describeWithNets('uncompressed weights', { withTinyYolov2: { quantized: false } }, ({ tinyYolov2 }) => { describeWithNets('uncompressed weights', { withTinyYolov2: { quantized: false, withSeparableConv: false } }, ({ tinyYolov2 }) => {
it('inputSize lg, finds all faces', async () => { it('inputSize lg, finds all faces', async () => {
const detections = await tinyYolov2.locateFaces(imgEl, { inputSize: SizeType.LG }) const detections = await tinyYolov2.locateFaces(imgEl, { inputSize: SizeType.LG })
...@@ -113,7 +112,7 @@ describe('tinyYolov2', () => { ...@@ -113,7 +112,7 @@ describe('tinyYolov2', () => {
await expectAllTensorsReleased(async () => { await expectAllTensorsReleased(async () => {
const res = await fetch('base/weights_uncompressed/tiny_yolov2_model.weights') const res = await fetch('base/weights_uncompressed/tiny_yolov2_model.weights')
const weights = new Float32Array(await res.arrayBuffer()) const weights = new Float32Array(await res.arrayBuffer())
const net = faceapi.createTinyYolov2(weights) const net = faceapi.createTinyYolov2(weights, false)
net.dispose() net.dispose()
}) })
}) })
...@@ -124,8 +123,8 @@ describe('tinyYolov2', () => { ...@@ -124,8 +123,8 @@ describe('tinyYolov2', () => {
it('disposes all param tensors', async () => { it('disposes all param tensors', async () => {
await expectAllTensorsReleased(async () => { await expectAllTensorsReleased(async () => {
const net = new faceapi.TinyYolov2() const net = new faceapi.TinyYolov2(false)
await net.load('base/weights') await net.load('base/weights_unused')
net.dispose() net.dispose()
}) })
}) })
......
import * as faceapi from '../../../src';
import { SizeType } from '../../../src/tinyYolov2/types';
import { describeWithNets, expectAllTensorsReleased, expectRectClose } from '../../utils';
import { expectedTinyYolov2SeparableConvBoxes } from './expectedResults';
describe('tinyYolov2, with separable convolutions', () => {
let imgEl: HTMLImageElement
beforeAll(async () => {
const img = await (await fetch('base/test/images/faces.jpg')).blob()
imgEl = await faceapi.bufferToImage(img)
})
describeWithNets('quantized weights', { withTinyYolov2: { quantized: true } }, ({ tinyYolov2 }) => {
it('inputSize lg, finds all faces', async () => {
const detections = await tinyYolov2.locateFaces(imgEl, { inputSize: SizeType.LG })
const expectedScores = [0.9, 0.9, 0.89, 0.85, 0.85, 0.85]
const maxBoxDelta = 1
const boxOrder = [0, 1, 2, 3, 4, 5]
expect(detections.length).toEqual(6)
detections.forEach((det, i) => {
expect(det.getScore()).toBeCloseTo(expectedScores[i], 2)
expectRectClose(det.getBox(), expectedTinyYolov2SeparableConvBoxes[boxOrder[i]], maxBoxDelta)
})
})
it('inputSize md, finds all faces', async () => {
const detections = await tinyYolov2.locateFaces(imgEl, { inputSize: SizeType.MD })
const expectedScores = [0.85, 0.85, 0.84, 0.83, 0.8, 0.8]
const maxBoxDelta = 17
const boxOrder = [5, 1, 4, 3, 2, 0]
expect(detections.length).toEqual(6)
detections.forEach((det, i) => {
expect(det.getScore()).toBeCloseTo(expectedScores[i], 2)
expectRectClose(det.getBox(), expectedTinyYolov2SeparableConvBoxes[boxOrder[i]], maxBoxDelta)
})
})
it('inputSize custom, finds all faces', async () => {
const detections = await tinyYolov2.locateFaces(imgEl, { inputSize: 416 })
const expectedScores = [0.85, 0.85, 0.84, 0.83, 0.8, 0.8]
const maxBoxDelta = 17
const boxOrder = [5, 1, 4, 3, 2, 0]
expect(detections.length).toEqual(6)
detections.forEach((det, i) => {
expect(det.getScore()).toBeCloseTo(expectedScores[i], 2)
expectRectClose(det.getBox(), expectedTinyYolov2SeparableConvBoxes[boxOrder[i]], maxBoxDelta)
})
})
})
describe('no memory leaks', () => {
describe('NeuralNetwork, uncompressed model', () => {
it('disposes all param tensors', async () => {
await expectAllTensorsReleased(async () => {
const res = await fetch('base/weights_uncompressed/tiny_yolov2_separable_conv_model.weights')
const weights = new Float32Array(await res.arrayBuffer())
const net = faceapi.createTinyYolov2(weights)
net.dispose()
})
})
})
describe('NeuralNetwork, quantized model', () => {
it('disposes all param tensors', async () => {
await expectAllTensorsReleased(async () => {
const net = new faceapi.TinyYolov2()
await net.load('base/weights')
net.dispose()
})
})
})
})
})
\ No newline at end of file
...@@ -4,9 +4,8 @@ import { IRect } from '../build/Rect'; ...@@ -4,9 +4,8 @@ import { IRect } from '../build/Rect';
import * as faceapi from '../src/'; import * as faceapi from '../src/';
import { NeuralNetwork } from '../src/commons/NeuralNetwork'; import { NeuralNetwork } from '../src/commons/NeuralNetwork';
import { IPoint } from '../src/'; import { IPoint } from '../src/';
import { allFacesFactory, allFacesMtcnnFactory } from '../src/allFacesFactory'; import { allFacesMtcnnFactory, allFacesSsdMobilenetv1Factory, allFacesTinyYolov2Factory } from '../src/allFacesFactory';
import { allFacesMtcnnFunction, allFacesFunction, tinyYolov2 } from '../src/globalApi'; import { allFacesMtcnnFunction, allFacesSsdMobilenetv1Function, allFacesTinyYolov2, allFacesTinyYolov2Function } from '../src/globalApi';
import { TinyYolov2 } from '../src/tinyYolov2/TinyYolov2';
export function zeros(length: number): Float32Array { export function zeros(length: number): Float32Array {
return new Float32Array(length) return new Float32Array(length)
...@@ -55,8 +54,13 @@ export type WithNetOptions = { ...@@ -55,8 +54,13 @@ export type WithNetOptions = {
quantized?: boolean quantized?: boolean
} }
export type WithTinyYolov2Options = WithNetOptions & {
withSeparableConv?: boolean
}
export type InjectNetArgs = { export type InjectNetArgs = {
allFaces: allFacesFunction allFacesSsdMobilenetv1: allFacesSsdMobilenetv1Function
allFacesTinyYolov2: allFacesTinyYolov2Function
allFacesMtcnn: allFacesMtcnnFunction allFacesMtcnn: allFacesMtcnnFunction
faceDetectionNet: faceapi.FaceDetectionNet faceDetectionNet: faceapi.FaceDetectionNet
faceLandmarkNet: faceapi.FaceLandmarkNet faceLandmarkNet: faceapi.FaceLandmarkNet
...@@ -67,13 +71,14 @@ export type InjectNetArgs = { ...@@ -67,13 +71,14 @@ export type InjectNetArgs = {
export type DescribeWithNetsOptions = { export type DescribeWithNetsOptions = {
withAllFaces?: boolean withAllFacesSsdMobilenetv1?: boolean
withAllFacesTinyYolov2?: boolean
withAllFacesMtcnn?: boolean withAllFacesMtcnn?: boolean
withFaceDetectionNet?: WithNetOptions withFaceDetectionNet?: WithNetOptions
withFaceLandmarkNet?: WithNetOptions withFaceLandmarkNet?: WithNetOptions
withFaceRecognitionNet?: WithNetOptions withFaceRecognitionNet?: WithNetOptions
withMtcnn?: WithNetOptions withMtcnn?: WithNetOptions
withTinyYolov2?: WithNetOptions withTinyYolov2?: WithTinyYolov2Options
} }
async function loadNetWeights(uri: string): Promise<Float32Array> { async function loadNetWeights(uri: string): Promise<Float32Array> {
...@@ -82,12 +87,13 @@ async function loadNetWeights(uri: string): Promise<Float32Array> { ...@@ -82,12 +87,13 @@ async function loadNetWeights(uri: string): Promise<Float32Array> {
async function initNet<TNet extends NeuralNetwork<any>>( async function initNet<TNet extends NeuralNetwork<any>>(
net: TNet, net: TNet,
uncompressedFilename: string | boolean uncompressedFilename: string | boolean,
isUnusedModel: boolean = false
) { ) {
await net.load( await net.load(
uncompressedFilename uncompressedFilename
? await loadNetWeights(`base/weights_uncompressed/${uncompressedFilename}`) ? await loadNetWeights(`base/weights_uncompressed/${uncompressedFilename}`)
: 'base/weights' : (isUnusedModel ? 'base/weights_unused' : 'base/weights')
) )
} }
...@@ -102,13 +108,15 @@ export function describeWithNets( ...@@ -102,13 +108,15 @@ export function describeWithNets(
let faceLandmarkNet: faceapi.FaceLandmarkNet = new faceapi.FaceLandmarkNet() let faceLandmarkNet: faceapi.FaceLandmarkNet = new faceapi.FaceLandmarkNet()
let faceRecognitionNet: faceapi.FaceRecognitionNet = new faceapi.FaceRecognitionNet() let faceRecognitionNet: faceapi.FaceRecognitionNet = new faceapi.FaceRecognitionNet()
let mtcnn: faceapi.Mtcnn = new faceapi.Mtcnn() let mtcnn: faceapi.Mtcnn = new faceapi.Mtcnn()
let tinyYolov2: faceapi.TinyYolov2 = new faceapi.TinyYolov2() let tinyYolov2: faceapi.TinyYolov2 = new faceapi.TinyYolov2(options.withTinyYolov2 && options.withTinyYolov2.withSeparableConv)
let allFaces = allFacesFactory(faceDetectionNet, faceLandmarkNet, faceRecognitionNet) let allFacesSsdMobilenetv1 = allFacesSsdMobilenetv1Factory(faceDetectionNet, faceLandmarkNet, faceRecognitionNet)
let allFacesTinyYolov2 = allFacesTinyYolov2Factory(tinyYolov2, faceLandmarkNet, faceRecognitionNet)
let allFacesMtcnn = allFacesMtcnnFactory(mtcnn, faceRecognitionNet) let allFacesMtcnn = allFacesMtcnnFactory(mtcnn, faceRecognitionNet)
beforeAll(async () => { beforeAll(async () => {
const { const {
withAllFaces, withAllFacesSsdMobilenetv1,
withAllFacesTinyYolov2,
withAllFacesMtcnn, withAllFacesMtcnn,
withFaceDetectionNet, withFaceDetectionNet,
withFaceLandmarkNet, withFaceLandmarkNet,
...@@ -117,21 +125,21 @@ export function describeWithNets( ...@@ -117,21 +125,21 @@ export function describeWithNets(
withTinyYolov2 withTinyYolov2
} = options } = options
if (withFaceDetectionNet || withAllFaces) { if (withFaceDetectionNet || withAllFacesSsdMobilenetv1) {
await initNet<faceapi.FaceDetectionNet>( await initNet<faceapi.FaceDetectionNet>(
faceDetectionNet, faceDetectionNet,
!!withFaceDetectionNet && !withFaceDetectionNet.quantized && 'ssd_mobilenetv1_model.weights' !!withFaceDetectionNet && !withFaceDetectionNet.quantized && 'ssd_mobilenetv1_model.weights'
) )
} }
if (withFaceLandmarkNet || withAllFaces) { if (withFaceLandmarkNet || withAllFacesSsdMobilenetv1 || withAllFacesTinyYolov2) {
await initNet<faceapi.FaceLandmarkNet>( await initNet<faceapi.FaceLandmarkNet>(
faceLandmarkNet, faceLandmarkNet,
!!withFaceLandmarkNet && !withFaceLandmarkNet.quantized && 'face_landmark_68_model.weights' !!withFaceLandmarkNet && !withFaceLandmarkNet.quantized && 'face_landmark_68_model.weights'
) )
} }
if (withFaceRecognitionNet || withAllFaces || withAllFacesMtcnn) { if (withFaceRecognitionNet || withAllFacesSsdMobilenetv1 || withAllFacesMtcnn || withAllFacesTinyYolov2) {
await initNet<faceapi.FaceRecognitionNet>( await initNet<faceapi.FaceRecognitionNet>(
faceRecognitionNet, faceRecognitionNet,
// TODO: figure out why quantized weights results in NaNs in testcases // TODO: figure out why quantized weights results in NaNs in testcases
...@@ -146,10 +154,11 @@ export function describeWithNets( ...@@ -146,10 +154,11 @@ export function describeWithNets(
) )
} }
if (withTinyYolov2) { if (withTinyYolov2 || withAllFacesTinyYolov2) {
await initNet<faceapi.TinyYolov2>( await initNet<faceapi.TinyYolov2>(
tinyYolov2, tinyYolov2,
!!withTinyYolov2 && !withTinyYolov2.quantized && 'tiny_yolov2_model.weights' !!withTinyYolov2 && !withTinyYolov2.quantized && 'tiny_yolov2_model.weights',
withTinyYolov2 && withTinyYolov2.withSeparableConv === false
) )
} }
}) })
...@@ -162,7 +171,16 @@ export function describeWithNets( ...@@ -162,7 +171,16 @@ export function describeWithNets(
tinyYolov2 && tinyYolov2.dispose() tinyYolov2 && tinyYolov2.dispose()
}) })
specDefinitions({ allFaces, allFacesMtcnn, faceDetectionNet, faceLandmarkNet, faceRecognitionNet, mtcnn, tinyYolov2 }) specDefinitions({
allFacesSsdMobilenetv1,
allFacesTinyYolov2,
allFacesMtcnn,
faceDetectionNet,
faceLandmarkNet,
faceRecognitionNet,
mtcnn,
tinyYolov2
})
}) })
} }
...@@ -9,9 +9,10 @@ app.use(express.static(path.join(__dirname, './public'))) ...@@ -9,9 +9,10 @@ app.use(express.static(path.join(__dirname, './public')))
app.use(express.static(path.join(__dirname, './node_modules/file-saver'))) app.use(express.static(path.join(__dirname, './node_modules/file-saver')))
app.use(express.static(path.join(__dirname, '../../examples/public'))) app.use(express.static(path.join(__dirname, '../../examples/public')))
app.use(express.static(path.join(__dirname, '../../weights'))) app.use(express.static(path.join(__dirname, '../../weights')))
app.use(express.static(path.join(__dirname, '../../weights_uncompressed')))
app.use(express.static(path.join(__dirname, '../../dist'))) app.use(express.static(path.join(__dirname, '../../dist')))
app.get('/', (req, res) => res.redirect('/quantize_tiny_yolov2')) app.get('/', (req, res) => res.redirect('/quantize_model'))
app.get('/quantize_tiny_yolov2', (req, res) => res.sendFile(path.join(viewsDir, 'quantizeTinyYolov2.html'))) app.get('/quantize_model', (req, res) => res.sendFile(path.join(viewsDir, 'quantizeModel.html')))
app.listen(3000, () => console.log('Listening on port 3000!')) app.listen(3000, () => console.log('Listening on port 3000!'))
\ No newline at end of file
...@@ -10,6 +10,10 @@ ...@@ -10,6 +10,10 @@
<script> <script>
tf = faceapi.tf tf = faceapi.tf
const modelName = 'tiny_yolov2_separable_conv'
const uncompressedWeightsUri = `tiny_yolov2_separable_conv_model_v1.weights`
const net = new faceapi.TinyYolov2(true)
async function loadNetWeights(uri) { async function loadNetWeights(uri) {
return new Float32Array(await (await fetch(uri)).arrayBuffer()) return new Float32Array(await (await fetch(uri)).arrayBuffer())
} }
...@@ -36,15 +40,13 @@ ...@@ -36,15 +40,13 @@
} }
async function quantizeAndSave() { async function quantizeAndSave() {
const modelName = 'tiny_yolov2'
const tinyYolov2 = new faceapi.TinyYolov2() await net.load(await loadNetWeights(uncompressedWeightsUri))
await tinyYolov2.load(await loadNetWeights(`uncompressed/tiny_yolov2_model.weights`))
const quantizedTensorArrays = [] const quantizedTensorArrays = []
const weightEntries = [] const weightEntries = []
tinyYolov2.getParamList().forEach(({ path, tensor }) => { net.getParamList().forEach(({ path, tensor }) => {
const { scale, min, qdata } = quantizeWeights(tensor) const { scale, min, qdata } = quantizeWeights(tensor)
const weightEntry = { const weightEntry = {
......
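// A minimal sketch of what a quantizeWeights helper could look like - the actual
// implementation is not shown in this diff, so the name and details here are assumed:
// linear uint8 quantization that stores scale and min, so a weight can later be
// reconstructed as value = qdata[i] * scale + min.
function quantizeWeightsSketch(tensor) {
  const data = tensor.dataSync()
  let min = Infinity, max = -Infinity
  data.forEach(v => { min = Math.min(min, v); max = Math.max(max, v) })
  const scale = (max - min) / 255 || 1
  const qdata = new Uint8Array(data.length)
  data.forEach((v, i) => { qdata[i] = Math.round((v - min) / scale) })
  return { scale, min, qdata }
}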
async function promiseSequential(promises) {
const curr = promises[0]
if (!curr) {
return
}
await curr()
return promiseSequential(promises.slice(1))
}
async function trainStep(batchCreators) { async function trainStep(batchCreators) {
await promiseSequential(batchCreators.map((batchCreator, dataIdx) => async () => { await promiseSequential(batchCreators.map((batchCreator, dataIdx) => async () => {
const { batchInput, landmarksBatchTensor } = await batchCreator() const { batchInput, landmarksBatchTensor } = await batchCreator()
let ts = Date.now() let ts = Date.now()
const cost = optimizer.minimize(() => { const cost = optimizer.minimize(() => {
const out = window.trainNet.forwardInput(batchInput.managed()) const out = window.trainNet.forwardInput(batchInput.managed())
...@@ -19,7 +10,7 @@ async function trainStep(batchCreators) { ...@@ -19,7 +10,7 @@ async function trainStep(batchCreators) {
landmarksBatchTensor, landmarksBatchTensor,
out out
) )
return loss return tf.sum(out)
}, true) }, true)
ts = Date.now() - ts ts = Date.now() - ts
...@@ -29,6 +20,7 @@ async function trainStep(batchCreators) { ...@@ -29,6 +20,7 @@ async function trainStep(batchCreators) {
cost.dispose() cost.dispose()
await tf.nextFrame() await tf.nextFrame()
console.log(tf.memory())
})) }))
} }
...@@ -73,7 +65,7 @@ function landmarkPositionsToArray(landmarks) { ...@@ -73,7 +65,7 @@ function landmarkPositionsToArray(landmarks) {
} }
function toFaceLandmarks(landmarks, { naturalWidth, naturalHeight }) { function toFaceLandmarks(landmarks, { naturalWidth, naturalHeight }) {
return new faceapi.FaceLandmarks( return new faceapi.FaceLandmarks68(
landmarks.map(l => new faceapi.Point(l.x / naturalWidth, l.y / naturalHeight)), landmarks.map(l => new faceapi.Point(l.x / naturalWidth, l.y / naturalHeight)),
{ width: naturalWidth, height: naturalHeight } { width: naturalWidth, height: naturalHeight }
) )
...@@ -100,20 +92,11 @@ async function getTrainData() { ...@@ -100,20 +92,11 @@ async function getTrainData() {
(_, i) => landmarksJson[i] (_, i) => landmarksJson[i]
) )
return await loadImagesInBatch(allLandmarks.slice(0, 100))
/**
const batch1 = await loadImagesInBatch(allLandmarks.slice(0, 4000)) const batch1 = await loadImagesInBatch(allLandmarks.slice(0, 4000))
const batch2 = await loadImagesInBatch(allLandmarks.slice(4000), 4000) const batch2 = await loadImagesInBatch(allLandmarks.slice(4000), 4000)
return batch1.concat(batch2) return batch1.concat(batch2)
} */
// https://stackoverflow.com/questions/6274339/how-can-i-shuffle-an-array
function shuffle(a) {
var j, x, i;
for (i = a.length - 1; i > 0; i--) {
j = Math.floor(Math.random() * (i + 1));
x = a[i];
a[i] = a[j];
a[j] = x;
}
return a;
} }
\ No newline at end of file
...@@ -60,7 +60,7 @@ async function init() { ...@@ -60,7 +60,7 @@ async function init() {
//window.nets.push(await loadNet('retrained/landmarks_v0.weights')) //window.nets.push(await loadNet('retrained/landmarks_v0.weights'))
//window.nets.push(await loadNet('retrained/landmarks_v2.weights')) //window.nets.push(await loadNet('retrained/landmarks_v2.weights'))
window.trainNet = await loadNet('retrained/landmarks_v6.weights') window.trainNet = await loadNet('/tmp/retrained/landmarks_v9.weights')
window.nets.push(trainNet) window.nets.push(trainNet)
$('#loader').hide() $('#loader').hide()
......
...@@ -10,6 +10,7 @@ ...@@ -10,6 +10,7 @@
<script src="faceLandmarksUi.js"></script> <script src="faceLandmarksUi.js"></script>
<script src="faceLandmarksTrain.js"></script> <script src="faceLandmarksTrain.js"></script>
<script src="FileSaver.js"></script> <script src="FileSaver.js"></script>
<script src="trainUtils.js"></script>
</head> </head>
<body> <body>
<div id="navbar"></div> <div id="navbar"></div>
...@@ -60,7 +61,7 @@ ...@@ -60,7 +61,7 @@
await train() await train()
} }
async function train(batchSize = 10) { async function train(batchSize = 1) {
for (let i = 0; i < trainSteps; i++) { for (let i = 0; i < trainSteps; i++) {
console.log('step', i) console.log('step', i)
const batchCreators = createBatchCreators(shuffle(window.trainData), batchSize) const batchCreators = createBatchCreators(shuffle(window.trainData), batchSize)
...@@ -69,20 +70,11 @@ ...@@ -69,20 +70,11 @@
ts = Date.now() - ts ts = Date.now() - ts
console.log('step %s done (%s ms)', i, ts) console.log('step %s done (%s ms)', i, ts)
if (((i + 1) % saveEveryNthIteration) === 0) { if (((i + 1) % saveEveryNthIteration) === 0) {
saveWeights(i) //saveWeights(window.trainNet, 'landmark_trained_weights_' + idx + '.weights')
} }
} }
} }
function saveWeights(idx = 0) {
const binaryWeights = new Float32Array(
window.trainNet.getParamList()
.map(({ tensor }) => Array.from(tensor.dataSync()))
.reduce((flat, arr) => flat.concat(arr))
)
saveAs(new Blob([binaryWeights]), 'landmark_trained_weights_' + idx + '.weights')
}
</script> </script>
</body> </body>
......
module.exports = function(config) {
config.set({
frameworks: ['jasmine', 'karma-typescript'],
files: [
'tinyYolov2/**/*.ts'
],
preprocessors: {
'**/*.ts': ['karma-typescript']
},
karmaTypescriptConfig: {
tsconfig: './tsconfig.test.json'
},
browsers: ['Chrome'],
browserNoActivityTimeout: 60000,
client: {
jasmine: {
timeoutInterval: 30000
}
}
})
}
{ {
"scripts": { "scripts": {
"start": "node server.js" "start": "node server.js",
"test": "karma start"
}, },
"author": "justadudewhohacks", "author": "justadudewhohacks",
"license": "MIT", "license": "MIT",
......
require('./.env') require('./faceLandmarks/.env')
const express = require('express') const express = require('express')
const path = require('path') const path = require('path')
const app = express() const app = express()
const viewsDir = path.join(__dirname, 'views') const publicDir = path.join(__dirname, './faceLandmarks')
app.use(express.static(viewsDir)) app.use(express.static(publicDir))
app.use(express.static(path.join(__dirname, './public'))) app.use(express.static(path.join(__dirname, './shared')))
app.use(express.static(path.join(__dirname, './tmp')))
app.use(express.static(path.join(__dirname, './node_modules/file-saver'))) app.use(express.static(path.join(__dirname, './node_modules/file-saver')))
app.use(express.static(path.join(__dirname, '../../examples/public'))) app.use(express.static(path.join(__dirname, '../../examples/public')))
app.use(express.static(path.join(__dirname, '../../weights'))) app.use(express.static(path.join(__dirname, '../../weights')))
...@@ -18,6 +17,6 @@ const trainDataPath = path.resolve(process.env.TRAIN_DATA_PATH) ...@@ -18,6 +17,6 @@ const trainDataPath = path.resolve(process.env.TRAIN_DATA_PATH)
app.use(express.static(trainDataPath)) app.use(express.static(trainDataPath))
app.get('/', (req, res) => res.redirect('/face_landmarks')) app.get('/', (req, res) => res.redirect('/face_landmarks'))
app.get('/face_landmarks', (req, res) => res.sendFile(path.join(viewsDir, 'faceLandmarks.html'))) app.get('/face_landmarks', (req, res) => res.sendFile(path.join(publicDir, 'train.html')))
app.listen(3000, () => console.log('Listening on port 3000!')) app.listen(3000, () => console.log('Listening on port 3000!'))
\ No newline at end of file
require('./tinyYolov2/.env')
const express = require('express')
const path = require('path')
const fs = require('fs')
const app = express()
const publicDir = path.join(__dirname, './tinyYolov2')
app.use(express.static(publicDir))
app.use(express.static(path.join(__dirname, './shared')))
app.use(express.static(path.join(__dirname, './node_modules/file-saver')))
app.use(express.static(path.join(__dirname, '../../examples/public')))
app.use(express.static(path.join(__dirname, '../../weights')))
app.use(express.static(path.join(__dirname, '../../dist')))
const trainDataPath = path.resolve(process.env.TRAIN_DATA_PATH)
const testDataPath = path.resolve(process.env.TEST_DATA_PATH)
const imagesPath = path.join(trainDataPath, './final_images')
const detectionsPath = path.join(trainDataPath, './final_detections')
app.use(express.static(imagesPath))
app.use(express.static(detectionsPath))
app.use(express.static(testDataPath))
const detectionFilenames = fs.readdirSync(detectionsPath)
const detectionFilenamesMultibox = JSON.parse(fs.readFileSync(path.join(__dirname, './tinyYolov2/multibox.json')))
app.use(express.static(trainDataPath))
app.get('/detection_filenames', (req, res) => res.status(202).send(detectionFilenames))
app.get('/detection_filenames_multibox', (req, res) => res.status(202).send(detectionFilenamesMultibox))
app.get('/', (req, res) => res.sendFile(path.join(publicDir, 'train.html')))
app.get('/verify', (req, res) => res.sendFile(path.join(publicDir, 'verify.html')))
app.get('/test', (req, res) => res.sendFile(path.join(publicDir, 'test.html')))
app.listen(3000, () => console.log('Listening on port 3000!'))
\ No newline at end of file
const log = (str, ...args) => console.log(`[${[(new Date()).toTimeString().substr(0, 8)]}] ${str || ''}`, ...args)
async function promiseSequential(promises) {
const curr = promises[0]
if (!curr) {
return
}
await curr()
return promiseSequential(promises.slice(1))
}
// https://stackoverflow.com/questions/6274339/how-can-i-shuffle-an-array
function shuffle(a) {
var j, x, i;
for (i = a.length - 1; i > 0; i--) {
j = Math.floor(Math.random() * (i + 1));
x = a[i];
a[i] = a[j];
a[j] = x;
}
return a;
}
function saveWeights(net, filename = 'train_tmp') {
const binaryWeights = new Float32Array(
net.getParamList()
.map(({ tensor }) => Array.from(tensor.dataSync()))
.reduce((flat, arr) => flat.concat(arr))
)
saveAs(new Blob([binaryWeights]), filename)
}
function imageToSquare(img) {
const scale = 608 / Math.max(img.height, img.width)
const width = scale * img.width
const height = scale * img.height
const canvas1 = faceapi.createCanvasFromMedia(img)
const targetCanvas = faceapi.createCanvas({ width: 608, height: 608 })
targetCanvas.getContext('2d').putImageData(canvas1.getContext('2d').getImageData(0, 0, width, height), 0, 0)
return targetCanvas
}
function getPaddingsAndReshapedSize(img, inputSize) {
const [h, w] = [img.height, img.width]
const maxDim = Math.max(h, w)
const f = inputSize / maxDim
const reshapedImgDims = {
height: Math.floor(h * f),
width: Math.floor(w * f)
}
const paddings = new faceapi.Point(
maxDim / img.width,
maxDim / img.height
)
return { paddings, reshapedImgDims }
}
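// Worked example (illustration only): for a 640x480 image and inputSize 416,
// maxDim = 640 and f = 416 / 640 = 0.65, so reshapedImgDims = { width: 416, height: 312 }
// and paddings = (640 / 640, 640 / 480) = (1, ~1.33) - the factors by which the square
// network input extends beyond the actual image content in x and y.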
\ No newline at end of file
<!DOCTYPE html>
<html>
<head>
<script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs@0.12.0"> </script>
<script src="FileSaver.js"></script>
</head>
<body>
<script>
const glorotNormal = tf.initializers.glorotNormal()
saveWeights()
function initSeparableConvWeights(inChannels, outChannels) {
return {
depthwiseFilter: glorotNormal.apply([3, 3, inChannels, 1]),
pointwiseFilter: glorotNormal.apply([1, 1, inChannels, outChannels]),
bias: tf.zeros([outChannels])
}
}
function initTinyYolov2SeparableWeights() {
const conv0 = initSeparableConvWeights(3, 16)
const conv1 = initSeparableConvWeights(16, 32)
const conv2 = initSeparableConvWeights(32, 64)
const conv3 = initSeparableConvWeights(64, 128)
const conv4 = initSeparableConvWeights(128, 256)
const conv5 = initSeparableConvWeights(256, 512)
const conv6 = initSeparableConvWeights(512, 1024)
const conv7 = initSeparableConvWeights(1024, 1024)
const conv8 = {
filters: glorotNormal.apply([1, 1, 1024, 25]),
bias: tf.zeros([25])
}
return {
conv0,
conv1,
conv2,
conv3,
conv4,
conv5,
conv6,
conv7,
conv8
}
}
function saveWeights() {
const w = initTinyYolov2SeparableWeights()
const binaryWeights = new Float32Array(
Array(8).fill(0)
.map((_, i) => w[`conv${i}`])
.map(ps => [ps.depthwiseFilter, ps.pointwiseFilter, ps.bias])
.reduce((flat, arr) => flat.concat(arr))
.concat([w.conv8.filters, w.conv8.bias])
.map(tensor => console.log(tensor) || Array.from(tensor.dataSync()))
.reduce((flat, arr) => flat.concat(arr))
)
saveAs(new Blob([binaryWeights]), 'foo.weights')
}
</script>
</body>
</html>
\ No newline at end of file
const CELL_SIZE = 32
const getNumCells = inputSize => inputSize / CELL_SIZE
const inverseSigmoid = x => Math.log(x / (1 - x))
function getAnchors() {
return window.net.anchors
}
function squaredSumOverMask(lossTensors, mask) {
return tf.tidy(() => tf.sum(tf.square(tf.mul(mask, lossTensors))))
}
function assignBoxesToAnchors(groundTruthBoxes, reshapedImgDims) {
const inputSize = Math.max(reshapedImgDims.width, reshapedImgDims.height)
const numCells = getNumCells(inputSize)
return groundTruthBoxes.map(box => {
const { left, top, width, height } = box.rescale(reshapedImgDims)
const ctX = left + (width / 2)
const ctY = top + (height / 2)
const col = Math.floor((ctX / inputSize) * numCells)
const row = Math.floor((ctY / inputSize) * numCells)
const anchorsByIou = getAnchors().map((anchor, idx) => ({
idx,
iou: faceapi.iou(
new faceapi.BoundingBox(0, 0, anchor.x * CELL_SIZE, anchor.y * CELL_SIZE),
new faceapi.BoundingBox(0, 0, width, height)
)
})).sort((a1, a2) => a2.iou - a1.iou)
const anchor = anchorsByIou[0].idx
return { row, col, anchor, box }
})
}
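// Worked example (illustration only): with inputSize 416 there are 416 / 32 = 13 grid
// cells per side. A box whose rescaled center lies at (200, 100) lands in
// col = floor((200 / 416) * 13) = 6 and row = floor((100 / 416) * 13) = 3, and is then
// assigned to whichever of the 5 anchor shapes has the highest IoU with its
// width/height (both boxes are compared at the origin, so only the shapes matter).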
function getGroundTruthMask(groundTruthBoxes, inputSize) {
const numCells = getNumCells(inputSize)
const mask = tf.zeros([numCells, numCells, 25])
const buf = mask.buffer()
groundTruthBoxes.forEach(({ row, col, anchor }) => {
const anchorOffset = anchor * 5
for (let i = 0; i < 5; i++) {
buf.set(1, row, col, anchorOffset + i)
}
})
return mask
}
function getCoordAndScoreMasks(inputSize) {
const numCells = getNumCells(inputSize)
const coordMask = tf.zeros([numCells, numCells, 25])
const scoreMask = tf.zeros([numCells, numCells, 25])
const coordBuf = coordMask.buffer()
const scoreBuf = scoreMask.buffer()
for (let row = 0; row < numCells; row++) {
for (let col = 0; col < numCells; col++) {
for (let anchor = 0; anchor < 5; anchor++) {
const anchorOffset = 5 * anchor
for (let i = 0; i < 4; i++) {
coordBuf.set(1, row, col, anchorOffset + i)
}
scoreBuf.set(1, row, col, anchorOffset + 4)
}
}
}
return { coordMask, scoreMask }
}
function computeBoxAdjustments(groundTruthBoxes, reshapedImgDims) {
const inputSize = Math.max(reshapedImgDims.width, reshapedImgDims.height)
const numCells = getNumCells(inputSize)
const adjustments = tf.zeros([numCells, numCells, 25])
const buf = adjustments.buffer()
groundTruthBoxes.forEach(({ row, col, anchor, box }) => {
const { left, top, right, bottom, width, height } = box.rescale(reshapedImgDims)
const centerX = (left + right) / 2
const centerY = (top + bottom) / 2
//const dCenterX = centerX - (col * CELL_SIZE + (CELL_SIZE / 2))
//const dCenterY = centerY - (row * CELL_SIZE + (CELL_SIZE / 2))
const dCenterX = centerX - (col * CELL_SIZE)
const dCenterY = centerY - (row * CELL_SIZE)
const dx = inverseSigmoid(dCenterX / CELL_SIZE)
const dy = inverseSigmoid(dCenterY / CELL_SIZE)
//const dx = dCenterX / CELL_SIZE
//const dy = dCenterY / CELL_SIZE
const dw = Math.log((width / CELL_SIZE) / getAnchors()[anchor].x)
const dh = Math.log((height / CELL_SIZE) / getAnchors()[anchor].y)
const anchorOffset = anchor * 5
buf.set(dx, row, col, anchorOffset + 0)
buf.set(dy, row, col, anchorOffset + 1)
buf.set(dw, row, col, anchorOffset + 2)
buf.set(dh, row, col, anchorOffset + 3)
})
return adjustments
}
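// The targets written above are the inverse of the YOLOv2 box decoding (sketch):
// for a ground truth box with center (cx, cy) and size (w, h) in pixels of the
// reshaped image,
//   dx = sigmoid^-1((cx - col * CELL_SIZE) / CELL_SIZE)
//   dy = sigmoid^-1((cy - row * CELL_SIZE) / CELL_SIZE)
//   dw = ln((w / CELL_SIZE) / anchor.x)
//   dh = ln((h / CELL_SIZE) / anchor.y)
// so a prediction matching these values exactly reproduces the ground truth box
// after applying sigmoid/exp and rescaling by CELL_SIZE and the anchor dimensions.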
function computeIous(predBoxes, groundTruthBoxes, reshapedImgDims) {
const inputSize = Math.max(reshapedImgDims.width, reshapedImgDims.height)
const numCells = getNumCells(inputSize)
const isSameAnchor = p1 => p2 =>
p1.row === p2.row
&& p1.col === p2.col
&& p1.anchor === p2.anchor
const ious = tf.zeros([numCells, numCells, 25])
const buf = ious.buffer()
groundTruthBoxes.forEach(({ row, col, anchor, box }) => {
const predBox = predBoxes.find(isSameAnchor({ row, col, anchor }))
if (!predBox) {
console.log(groundTruthBoxes)
console.log(predBoxes)
throw new Error(`no output box found for: row ${row}, col ${col}, anchor ${anchor}`)
}
const iou = faceapi.iou(
box.rescale(reshapedImgDims),
predBox.box.rescale(reshapedImgDims)
)
if (window.debug) {
console.log('ground truth box:', box.rescale(reshapedImgDims).toRect())
console.log('predicted box:', predBox.box.rescale(reshapedImgDims).toRect())
console.log('predicted score:', predBox.score)
console.log('iou:', iou)
}
const anchorOffset = anchor * 5
buf.set(iou, row, col, anchorOffset + 4)
})
return ious
}
window.computeNoObjectLoss = function(outTensor, mask) {
return tf.tidy(() => {
const lossTensor = tf.sigmoid(outTensor)
return squaredSumOverMask(lossTensor, mask)
})
}
function computeObjectLoss(outTensor, groundTruthBoxes, reshapedImgDims, paddings, mask) {
return tf.tidy(() => {
const predBoxes = window.net.postProcess(
outTensor,
{ paddings }
)
const ious = computeIous(
predBoxes,
groundTruthBoxes,
reshapedImgDims
)
const lossTensor = tf.sub(ious, tf.sigmoid(outTensor))
return squaredSumOverMask(lossTensor, mask)
})
}
function computeCoordLoss(groundTruthBoxes, outTensor, reshapedImgDims, mask, paddings) {
return tf.tidy(() => {
const boxAdjustments = computeBoxAdjustments(
groundTruthBoxes,
reshapedImgDims
)
if (window.debug) {
const indToPos = []
const numCells = outTensor.shape[1]
for (let row = 0; row < numCells; row++) {
for (let col = 0; col < numCells; col++) {
for (let anchor = 0; anchor < 25; anchor++) {
indToPos.push({ row, col, anchor: Math.floor(anchor / 5) })
}
}
}
const indices = Array.from(mask.dataSync()).map((val, ind) => ({ val, ind })).filter(v => v.val !== 0).map(v => v.ind)
const gt = Array.from(boxAdjustments.dataSync())
const out = Array.from(outTensor.dataSync())
const comp = indices.map(i => (
{
pos: indToPos[i],
gt: gt[i],
out: out[i]
}
))
console.log(comp.map(c => `gt: ${c.gt}, out: ${c.out}`))
const getBbox = (which) => {
const { row, col, anchor } = comp[0].pos
const ctX = ((col + faceapi.sigmoid(comp[0][which])) / numCells) * paddings.x
const ctY = ((row + faceapi.sigmoid(comp[1][which])) / numCells) * paddings.y
const width = ((Math.exp(comp[2][which]) * getAnchors()[anchor].x) / numCells) * paddings.x
const height = ((Math.exp(comp[3][which]) * getAnchors()[anchor].y) / numCells) * paddings.y
const x = (ctX - (width / 2))
const y = (ctY - (height / 2))
return new faceapi.BoundingBox(x, y, x + width, y + height)
}
const outRect = getBbox('out').rescale(reshapedImgDims).toRect()
const gtRect = getBbox('gt').rescale(reshapedImgDims).toRect()
console.log('out', outRect)
console.log('gtRect', gtRect)
}
const lossTensor = tf.sub(boxAdjustments, outTensor)
return squaredSumOverMask(lossTensor, mask)
})
}
function computeLoss(outTensor, groundTruth, reshapedImgDims, paddings) {
const inputSize = Math.max(reshapedImgDims.width, reshapedImgDims.height)
if (!inputSize) {
throw new Error(`invalid inputSize: ${inputSize}`)
}
let groundTruthBoxes = assignBoxesToAnchors(
groundTruth
.map(({ x, y, width, height }) => new faceapi.Rect(x, y, width, height))
.map(rect => rect.toBoundingBox()),
reshapedImgDims
)
const groundTruthMask = getGroundTruthMask(groundTruthBoxes, inputSize)
const { coordMask, scoreMask } = getCoordAndScoreMasks(inputSize)
const noObjectLossMask = tf.tidy(() => tf.mul(scoreMask, tf.sub(tf.scalar(1), groundTruthMask)))
const objectLossMask = tf.tidy(() => tf.mul(scoreMask, groundTruthMask))
const coordLossMask = tf.tidy(() => tf.mul(coordMask, groundTruthMask))
const noObjectLoss = tf.tidy(() =>
tf.mul(
tf.scalar(noObjectScale),
computeNoObjectLoss(outTensor, noObjectLossMask)
)
)
const objectLoss = tf.tidy(() =>
tf.mul(
tf.scalar(objectScale),
computeObjectLoss(outTensor, groundTruthBoxes, reshapedImgDims, paddings, objectLossMask)
)
)
const coordLoss = tf.tidy(() =>
tf.mul(
tf.scalar(coordScale),
computeCoordLoss(groundTruthBoxes, outTensor, reshapedImgDims, coordLossMask, paddings)
)
)
const totalLoss = tf.tidy(() => noObjectLoss.add(objectLoss).add(coordLoss))
return {
noObjectLoss,
objectLoss,
coordLoss,
totalLoss
}
}
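// Summary of the objective assembled above (sketch): with each mask restricting the
// sum to the relevant channels,
//   totalLoss = noObjectScale * noObjectLoss  (sigmoid(score) pushed towards 0 where there is no face)
//             + objectScale   * objectLoss    (sigmoid(score) pushed towards the IoU of the predicted box)
//             + coordScale    * coordLoss     (raw box offsets pushed towards the encoded targets)
// with the scales taken from window.objectScale, window.noObjectScale and window.coordScale.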
\ No newline at end of file
import * as _tf from '@tensorflow/tfjs-core';
const faceapi = require('../../../dist/face-api.js')
const tf: typeof _tf = faceapi.tf
require('./loss')
window['faceapi'] = faceapi
window['tf'] = tf
const anchors = [
new faceapi.Point(1.603231, 2.094468),
new faceapi.Point(6.041143, 7.080126),
new faceapi.Point(2.882459, 3.518061),
new faceapi.Point(4.266906, 5.178857),
new faceapi.Point(9.041765, 10.66308)
]
window['net'] = {
getAnchors() {
return anchors
}
}
describe('loss', () => {
describe('computeNoObjectLoss', () => {
const computeNoObjectLoss = window['computeNoObjectLoss']
it('should only compute loss over scores, 1x1 grid', () => tf.tidy(() => {
const outTensor = tf.zeros([1, 1, 1, 25])
const loss = tf.sum(computeNoObjectLoss(outTensor)).dataSync()[0]
expect(loss).toEqual(0.5 * 0.5 * 5)
}))
it('should only compute loss over scores, 13x13 grid', () => tf.tidy(() => {
const outTensor = tf.zeros([1, 13, 13, 25])
const loss = tf.sum(computeNoObjectLoss(outTensor)).dataSync()[0]
expect(loss).toEqual(0.5 * 0.5 * 5 * 13 * 13)
}))
it('should only compute loss over scores, 13x13 grid, batchSize: 10', () => tf.tidy(() => {
const outTensor = tf.zeros([10, 13, 13, 25])
const loss = tf.sum(computeNoObjectLoss(outTensor)).dataSync()[0]
expect(loss).toEqual(0.5 * 0.5 * 5 * 13 * 13 * 10)
}))
})
})
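// Where the expected values come from (sketch): an all-zero output tensor yields
// sigmoid(0) = 0.5 on every channel, so each of the 5 per-anchor score channels
// contributes 0.5 * 0.5 = 0.25 per grid cell - assuming the loss is restricted to
// the score channels - giving 1.25 per cell, times 13 * 13 cells and the batch size.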
\ No newline at end of file
<!DOCTYPE html>
<html>
<head>
<script src="face-api.js"></script>
<script src="commons.js"></script>
<script src="trainUtils.js"></script>
<script src="loss.js"></script>
<script src="FileSaver.js"></script>
<link rel="stylesheet" href="styles.css">
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/materialize/0.100.2/css/materialize.css">
<script type="text/javascript" src="https://code.jquery.com/jquery-2.1.1.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/materialize/0.100.2/js/materialize.min.js"></script>
</head>
<body>
<script>
window.tf = faceapi.tf
// hyper parameters
window.objectScale = 5
window.noObjectScale = 1
window.coordScale = 1
const weightsUrl = `/tmp/tmp__224_35060__320_41188__416_31050__608_16520.weights`
//const inputSizes = [160, 224, 320, 416]
const inputSizes = [512, 608]
async function loadNetWeights(uri) {
return new Float32Array(await (await fetch(uri)).arrayBuffer())
}
async function fetchFddbJson() {
return fetch('/fddb-detections.json').then(res => res.json())
}
async function run() {
window.fddbJson = await fetchFddbJson()
const weights = await loadNetWeights(weightsUrl)
window.net = new faceapi.TinyYolov2(true)
await window.net.load(weights)
await promiseSequential(inputSizes.map(inputSize => async () => {
await promiseSequential(window.fddbJson.map(({ filePath, rects }) => () => {
return test(filePath, rects, inputSize)
}))
const losses = Object.keys(window.lossMap[inputSize]).map(k => window.lossMap[inputSize][k])
const totalLoss = losses
.map(l => l.totalLoss)
.reduce((sum, l) => sum + l)
const avgLoss = totalLoss / losses.length
log(`totalLoss (${inputSize}): ${totalLoss}`)
log(`avgLoss (${inputSize}): ${avgLoss}`)
window.losses = window.losses || {}
window.losses[inputSize] = { totalLoss, avgLoss }
}))
console.log(window.losses)
}
async function test(fileUri, rects, inputSize) {
const img = await faceapi.bufferToImage(await fetchImage(fileUri))
const groundTruthBoxes = rects
.map(({ x, y, width, height }) => new faceapi.Rect(x, y, width, height))
.map(rect => rect.clipAtImageBorders(img.width, img.height))
.map(({ x, y, width, height }) => ({
x: x / img.width,
y: y / img.height,
width: width / img.width,
height: height / img.height,
}))
const { reshapedImgDims, paddings } = getPaddingsAndReshapedSize(img, inputSize)
const squareImg = imageToSquare(img)
const netInput = (await faceapi.toNetInput(squareImg)).managed()
const losses = tf.tidy(() => {
const outTensor = window.net.forwardInput(netInput, inputSize)
const {
noObjectLoss,
objectLoss,
coordLoss,
totalLoss
} = computeLoss(
outTensor,
groundTruthBoxes,
reshapedImgDims,
paddings
)
const losses = {
totalLoss: totalLoss.dataSync()[0],
noObjectLoss: noObjectLoss.dataSync()[0],
objectLoss: objectLoss.dataSync()[0],
coordLoss: coordLoss.dataSync()[0]
}
return losses
})
log(`${fileUri}:`)
log(`ground truth boxes: ${groundTruthBoxes.length}`)
log(`noObjectLoss: ${losses.noObjectLoss}`)
log(`objectLoss: ${losses.objectLoss}`)
log(`coordLoss: ${losses.coordLoss}`)
log(`totalLoss: ${losses.totalLoss}`)
if (Object.keys(losses).map(k => losses[k]).some(loss => isNaN(loss) || loss === Infinity)) {
console.log(groundTruthBoxes)
console.log(img)
console.log(losses)
throw new Error('corrupted loss value')
}
window.lossMap = window.lossMap || {}
window.lossMap[inputSize] = window.lossMap[inputSize] || {}
window.lossMap[inputSize][fileUri] = losses
}
$(document).ready(function() {
run()
})
</script>
</body>
</html>
\ No newline at end of file
<!DOCTYPE html>
<html>
<head>
<link rel="stylesheet" href="styles.css">
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/materialize/0.100.2/css/materialize.css">
<script type="text/javascript" src="https://code.jquery.com/jquery-2.1.1.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/materialize/0.100.2/js/materialize.min.js"></script>
<script src="face-api.js"></script>
<script src="commons.js"></script>
<script src="FileSaver.js"></script>
<script src="trainUtils.js"></script>
<script src="train.js"></script>
<script src="loss.js"></script>
</head>
<body>
<script>
tf = faceapi.tf
const startIdx160 = 26600
const startIdx224 = 61660
const startIdx320 = 67788
const startIdx416 = 57650
const startIdx608 = 16520
//const weightsUrl = `/tmp/tmp__160_${startIdx160}__224_${startIdx224}__320_${startIdx320}__416_${startIdx416}__608_${startIdx608}.weights`
const weightsUrl = `/tmp/tmp_multiscale_count_8700.weights`
const fromEpoch = 0
const trainOnlyMultibox = false
const trainSizes = [416, 512, 608]
//const trainSizes = [608]
window.debug = false
window.logTrainSteps = true
window.count = 0
// hyper parameters
window.objectScale = 5
window.noObjectScale = 1
window.coordScale = 1
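// multi-scale training: each chunk of rescaleEveryNthBatch batches is trained at every input size in trainSizes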
const rescaleEveryNthBatch = 100
window.saveEveryNthDataIdx = trainSizes.length * rescaleEveryNthBatch
window.trainSteps = 4000
//window.optimizer = tf.train.sgd(0.001)
window.optimizer = tf.train.adam(0.001, 0.9, 0.999, 1e-8)
// all samples
//const dataStartIdx = 8000
const dataStartIdx = 0
const numTrainSamples = Infinity
async function loadNetWeights(uri) {
return new Float32Array(await (await fetch(uri)).arrayBuffer())
}
async function fetchDetectionFilenames() {
return fetch('/detection_filenames').then(res => res.json())
}
async function fetchDetectionFilenamesMultibox() {
return fetch('/detection_filenames_multibox').then(res => res.json())
}
async function run() {
const weights = await loadNetWeights(weightsUrl)
window.net = new faceapi.TinyYolov2(true)
await window.net.load(weights)
window.net.variable()
const fetchDetectionsFn = trainOnlyMultibox
? fetchDetectionFilenamesMultibox
: fetchDetectionFilenames
window.detectionFilenames = (await fetchDetectionsFn()).slice(dataStartIdx, dataStartIdx + numTrainSamples)
window.lossMap = {}
console.log('ready')
}
function logLossChange(lossType) {
const { currentLoss, prevLoss, detectionFilenames } = window
log(`${lossType} : ${faceapi.round(currentLoss[lossType])} (avg: ${faceapi.round(currentLoss[lossType] / detectionFilenames.length)}) (delta: ${currentLoss[lossType] - prevLoss[lossType]})`)
}
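// saves a weights checkpoint every saveEveryNthDataIdx processed batches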
function onBatchProcessed(dataIdx, inputSize) {
window.count++
const idx = (dataIdx + 1) + (window.epoch * window.detectionFilenames.length)
console.log('dataIdx', dataIdx)
if ((window.count % saveEveryNthDataIdx) === 0) {
saveWeights(window.net, `tmp_multiscale_count_${window.count}.weights`)
}
}
function _onBatchProcessed(dataIdx, inputSize) {
const idx = (dataIdx + 1) + (window.epoch * window.detectionFilenames.length)
console.log('idx', idx)
if ((idx % saveEveryNthDataIdx) === 0) {
saveWeights(window.net, `tmp__224_${startIdx224 + (inputSize === 224 ? idx : 0)}__320_${startIdx320 + (inputSize === 320 ? idx : 0)}__416_${startIdx416 + (inputSize === 416 ? idx : 0)}__608_${startIdx608 + (inputSize === 608 ? idx : 0)}.weights`)
}
}
async function train() {
const batchSize = 1
for (let i = fromEpoch; i < trainSteps; i++) {
window.epoch = i
log('step', i)
let ts = Date.now()
const batchCreators = createBatchCreators(shuffle(window.detectionFilenames), batchSize)
await trainStep(batchCreators, trainSizes, rescaleEveryNthBatch, onBatchProcessed)
ts = Date.now() - ts
log()
log('--------------------')
log()
log('step %s done (%s ms)', i, ts)
window.prevLoss = window.currentLoss
window.currentLoss = Object.keys(lossMap)
.map(filename => lossMap[filename])
.reduce((accumulatedLosses, losses) =>
Object.keys(losses)
.map(key => ({
[key]: (accumulatedLosses[key] || 0) + losses[key]
}))
.reduce((map, curr) => ({ ...map, ...curr }), {}),
{}
)
if (window.prevLoss) {
logLossChange('noObjectLoss')
logLossChange('objectLoss')
logLossChange('coordLoss')
logLossChange('totalLoss')
}
log()
log('--------------------')
log()
}
}
run()
</script>
</body>
</html>
\ No newline at end of file
const batchIdx = 0
function minimize(groundTruthBoxes, batchInput, inputSize, batch, { reshapedImgDims, paddings }) {
const filename = batch.filenames[batchIdx]
const { dataIdx } = batch
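// optimizer.minimize runs the forward pass and loss computation; the second argument (returnCost = true) makes it return the total loss tensor, which is disposed by the caller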
return optimizer.minimize(() => {
const outTensor = window.net.forwardInput(batchInput, inputSize)
const {
noObjectLoss,
objectLoss,
coordLoss,
totalLoss
} = computeLoss(
outTensor,
groundTruthBoxes,
reshapedImgDims,
paddings
)
const losses = {
totalLoss: totalLoss.dataSync()[0],
noObjectLoss: noObjectLoss.dataSync()[0],
objectLoss: objectLoss.dataSync()[0],
coordLoss: coordLoss.dataSync()[0]
}
const lossKey = `${filename}_${inputSize}`
if (window.logTrainSteps) {
log(`ground truth boxes: ${groundTruthBoxes.length}`)
log(`noObjectLoss[${dataIdx}]: ${losses.noObjectLoss}`)
log(`objectLoss[${dataIdx}]: ${losses.objectLoss}`)
log(`coordLoss[${dataIdx}]: ${losses.coordLoss}`)
log(`totalLoss[${dataIdx}]: ${losses.totalLoss}`)
if (window.lossMap[lossKey]) {
log(`loss change: ${losses.totalLoss - window.lossMap[lossKey].totalLoss}`)
}
}
window.lossMap[lossKey] = losses
return totalLoss
}, true)
}
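// processes the batch creators chunk-wise (rescaleEveryNthBatch at a time), training each chunk at every input size before moving on to the next chunk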
async function trainStep(batchCreators, inputSizes, rescaleEveryNthBatch, onBatchProcessed = () => {}) {
async function step(currentBatchCreators) {
if (!currentBatchCreators.batchCreators.length) {
return
}
await promiseSequential(inputSizes.map(inputSize => async () => {
await promiseSequential(currentBatchCreators.batchCreators.map(batchCreator => async () => {
const batch = await batchCreator()
const { imgs, groundTruthBoxes, filenames, dataIdx } = batch
const img = imgs[0]
const { reshapedImgDims, paddings } = getPaddingsAndReshapedSize(img, inputSize)
const squareImg = imageToSquare(img)
const batchInput = await faceapi.toNetInput(squareImg)
const [imgHeight, imgWidth] = batchInput.inputs[batchIdx].shape
// skip ground truth boxes that are too tiny for the current input size
const scaleFactor = inputSize / Math.max(imgHeight, imgWidth)
const filteredGroundTruthBoxes = groundTruthBoxes[batchIdx].filter(({ x, y, width, height }) => {
const box = (new faceapi.Rect(x, y, width, height))
.toBoundingBox()
.rescale({ height: imgHeight, width: imgWidth })
.rescale(scaleFactor)
const isTooTiny = box.width < 40 || box.height < 40
if (isTooTiny && window.debug) {
log(`skipping box for input size ${inputSize}: (${Math.floor(box.width)} x ${Math.floor(box.height)})`)
}
return !isTooTiny
})
if (!filteredGroundTruthBoxes.length) {
if (window.debug) {
log(`no boxes for input size ${inputSize}, ${groundTruthBoxes[batchIdx].length} boxes were too small`)
}
batchInput.dispose()
onBatchProcessed(dataIdx, inputSize)
return
}
let ts = Date.now()
const loss = minimize(filteredGroundTruthBoxes, batchInput, inputSize, batch, { reshapedImgDims, paddings })
ts = Date.now() - ts
if (window.logTrainSteps) {
log(`trainStep time for dataIdx ${dataIdx} (${inputSize}): ${ts} ms`)
}
loss.dispose()
batchInput.dispose()
onBatchProcessed(dataIdx, inputSize)
await tf.nextFrame()
}))
}))
await step(currentBatchCreators.next(rescaleEveryNthBatch))
}
await step(batchCreators.next(rescaleEveryNthBatch))
}
async function fetchGroundTruthBoxesForFile(file) {
const boxes = await fetch(file).then(res => res.json())
return {
file,
boxes
}
}
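// splits the detection file list into batches and returns lazy batch creators, which fetch the ground truth boxes and images on demand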
function createBatchCreators(detectionFilenames, batchSize) {
if (batchSize < 1) {
throw new Error('invalid batch size: ' + batchSize)
}
const batches = []
const pushToBatch = (remaining) => {
if (remaining.length) {
batches.push(remaining.slice(0, batchSize))
pushToBatch(remaining.slice(batchSize))
}
return batches
}
pushToBatch(detectionFilenames)
const batchCreators = batches.map((filenamesForBatch, dataIdx) => async () => {
const groundTruthBoxes = (await Promise.all(filenamesForBatch.map(fetchGroundTruthBoxesForFile)))
.map(({ boxes }) => boxes)
const imgs = await Promise.all(filenamesForBatch.map(
async file => await faceapi.bufferToImage(await fetchImage(file.replace('.json', '')))
))
return {
imgs,
groundTruthBoxes,
filenames: filenamesForBatch,
dataIdx
}
})
let idx = 0
function next(n) {
const nextBatchCreators = batchCreators.slice(idx, idx + n)
idx += n
return {
batchCreators: nextBatchCreators,
next
}
}
return {
data: batchCreators,
next
}
}
\ No newline at end of file
<!DOCTYPE html>
<html>
<head>
<script src="face-api.js"></script>
<script src="commons.js"></script>
<link rel="stylesheet" href="styles.css">
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/materialize/0.100.2/css/materialize.css">
<script type="text/javascript" src="https://code.jquery.com/jquery-2.1.1.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/materialize/0.100.2/js/materialize.min.js"></script>
</head>
<body>
<div id="navbar"></div>
<div class="center-content page-container">
<div class="progress" id="loader">
<div class="indeterminate"></div>
</div>
<div style="position: relative" class="margin">
<img id="inputImg" src="" style="max-width: 800px;" />
<canvas id="overlay" />
</div>
<div class="row side-by-side">
<div id="selectList"></div>
<div class="row">
<label for="imgUrlInput">Get image from URL:</label>
<input id="imgUrlInput" type="text" class="bold">
</div>
<button
class="waves-effect waves-light btn"
onclick="loadImageFromUrl()"
>
Ok
</button>
</div>
<div class="row side-by-side">
<div class="row input-field" style="margin-right: 20px;">
<select id="sizeType">
<option value="128">128 x 128</option>
<option value="160">160 x 160</option>
<option value="224">224 x 224</option>
<option value="320">320 x 320</option>
<option value="416">416 x 416</option>
<option value="608">608 x 608</option>
</select>
<label>Input Size</label>
</div>
<div class="row">
<label for="scoreThreshold">Score Threshold:</label>
<input disabled value="0.5" id="scoreThreshold" type="text" class="bold">
</div>
<button
class="waves-effect waves-light btn"
onclick="onDecreaseThreshold()"
>
<i class="material-icons left">-</i>
</button>
<button
class="waves-effect waves-light btn"
onclick="onIncreaseThreshold()"
>
<i class="material-icons left">+</i>
</button>
</div>
<div class="row">
<label for="imgByNr">Enter image NR: </label>
<input id="imgByNr" type="text" class="bold">
</div>
</div>
<script>
let scoreThreshold = 0.5
let sizeType = 608
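// arrow up (keyCode 38) / arrow down (keyCode 40) in the image number field step through the image list by index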
function onKeyDown(e) {
e.target.value = (
parseInt(e.target.value) + (e.keyCode === 38 ? 1 : (e.keyCode === 40 ? -1 : 0))
) || e.target.value || 0
const imgUri = window.imgs[e.target.value]
console.log(imgUri)
onSelectionChanged(imgUri)
}
function onIncreaseThreshold() {
scoreThreshold = Math.min(faceapi.round(scoreThreshold + 0.1), 1.0)
$('#scoreThreshold').val(scoreThreshold)
updateResults()
}
function onDecreaseThreshold() {
scoreThreshold = Math.max(faceapi.round(scoreThreshold - 0.1), 0.1)
$('#scoreThreshold').val(scoreThreshold)
updateResults()
}
function onSizeTypeChanged(e, c) {
sizeType = e.target.value
$('#sizeType').val(sizeType)
updateResults()
}
async function loadImageFromUrl(url) {
const img = await requestExternalImage($('#imgUrlInput').val())
$('#inputImg').get(0).src = img.src
updateResults()
}
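// runs face detection at the currently selected input size and score threshold and draws the boxes, rescaled to the displayed image size, onto the overlay canvas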
async function updateResults() {
const inputImgEl = $('#inputImg').get(0)
const { width, height } = inputImgEl
const canvas = $('#overlay').get(0)
canvas.width = width
canvas.height = height
const forwardParams = {
inputSize: parseInt(sizeType),
scoreThreshold
}
const detections = await window.net.locateFaces(inputImgEl, forwardParams)
faceapi.drawDetection('overlay', detections.map(det => det.forSize(width, height)))
}
async function onSelectionChanged(uri) {
const imgBuf = await fetchImage(uri)
$(`#inputImg`).get(0).src = (await faceapi.bufferToImage(imgBuf)).src
updateResults()
}
async function loadNetWeights(uri) {
return new Float32Array(await (await fetch(uri)).arrayBuffer())
}
async function fetchDetectionFilenames() {
return fetch('/detection_filenames').then(res => res.json())
}
async function run() {
$('#imgByNr').keydown(onKeyDown)
const startIdx224 = 35060
const startIdx320 = 41188
const startIdx416 = 31050
const startIdx608 = 16520
//const weightsUrl = `/tmp/tmp__224_${startIdx224}__320_${startIdx320}__416_${startIdx416}__608_${startIdx608}.weights`
const weightsUrl = `/tmp/tmp_multiscale_count_4200.weights`
const weights = await loadNetWeights(weightsUrl)
window.net = new faceapi.TinyYolov2(true)
await window.net.load(weights)
window.imgs = (await fetchDetectionFilenames()).map(f => f.replace('.json', ''))
$('#loader').hide()
onSelectionChanged($('#selectList select').val())
}
$(document).ready(function() {
renderNavBar('#navbar', 'tiny_yolov2_face_detection')
renderImageSelectList(
'#selectList',
async (uri) => {
await onSelectionChanged(uri)
},
'bbt1.jpg'
)
const sizeTypeSelect = $('#sizeType')
sizeTypeSelect.val(sizeType)
sizeTypeSelect.on('change', onSizeTypeChanged)
sizeTypeSelect.material_select()
run()
})
</script>
</body>
</html>
\ No newline at end of file
{
"extends": "../../tsconfig.json",
"include": [
"tinyYolov2"
]
}
[{"weights":[{"name":"conv0/conv/filters","shape":[3,3,3,16],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.0036540280370151294,"min":-0.44213739247883066}},{"name":"conv0/conv/bias","shape":[16],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.002164303555208094,"min":-0.13418682042290184}},{"name":"conv0/bn/sub","shape":[16],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.0028029036288167914,"min":-0.3139252064274806}},{"name":"conv0/bn/truediv","shape":[16],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.05010388411727606,"min":2.7867696285247803}},{"name":"conv1/conv/filters","shape":[3,3,16,32],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.0017398431020624497,"min":-0.22617960326811845}},{"name":"conv1/conv/bias","shape":[32],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.001145745112615473,"min":-0.12488621727508656}},{"name":"conv1/bn/sub","shape":[32],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.00586619587505565,"min":-0.9503237317590153}},{"name":"conv1/bn/truediv","shape":[32],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.007056442896525065,"min":1.2661800384521484}},{"name":"conv2/conv/filters","shape":[3,3,32,64],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.0010545693776186774,"min":-0.12971203344709734}},{"name":"conv2/conv/bias","shape":[64],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.0005329197820495157,"min":-0.07034541123053607}},{"name":"conv2/bn/sub","shape":[64],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.005863590567719703,"min":-0.691903686990925}},{"name":"conv2/bn/truediv","shape":[64],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.006374212339812634,"min":1.4001796245574951}},{"name":"conv3/conv/filters","shape":[3,3,64,128],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.0006935241175632851,"min":-0.08114232175490436}},{"name":"conv3/conv/bias","shape":[128],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.0002854522025468303,"min":-0.04224692597693088}},{"name":"conv3/bn/sub","shape":[128],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.008516784275279325,"min":-1.1667994457132675}},{"name":"conv3/bn/truediv","shape":[128],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.004096750652088838,"min":1.1394747495651245}},{"name":"conv4/conv/filters","shape":[3,3,128,256],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.0006537227946169236,"min":-0.06733344784554313}},{"name":"conv4/conv/bias","shape":[256],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.00018004616905076831,"min":-0.041770711219778246}},{"name":"conv4/bn/sub","shape":[256],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.011157989969440535,"min":-1.5509606057522343}},{"name":"conv4/bn/truediv","shape":[256],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.00421752836190018,"min":1.1967103481292725}},{"name":"conv5/conv/filters","shape":[3,3,256,512],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.0008280676077393925,"min":-0.05299632689532112}},{"name":"conv5/conv/bias","shape":[512],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.00007708504312105623,"min":-0.020973851904273033}},{"name":"conv5/bn/sub","shape":[512],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.011612189049814262,"min":-1.7302161684223252}},{"name":"conv5/bn/truediv","shape":[512],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.003486
9993434232826,"min":1.2963157892227173}},{"name":"conv6/conv/filters","shape":[3,3,512,1024],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.0003250038945207409,"min":-0.04030048292057187}},{"name":"conv6/conv/bias","shape":[1024],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.00009164303220114009,"min":-0.02338002622127533}},{"name":"conv6/bn/sub","shape":[1024],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.0168181017333386,"min":-2.068626513200648}},{"name":"conv6/bn/truediv","shape":[1024],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.004455030198190726,"min":1.309550166130066}},{"name":"conv7/conv/filters","shape":[3,3,1024,1024],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.00026194834533859704,"min":-0.033267439858001825}},{"name":"conv7/conv/bias","shape":[1024],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.0001491123554753322,"min":-0.011928988438026577}},{"name":"conv7/bn/sub","shape":[1024],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.009263983427309523,"min":-1.1765258952683093}},{"name":"conv7/bn/truediv","shape":[1024],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.01123507022857666,"min":0.9360886812210083}},{"name":"conv8/filters","shape":[1,1,1024,30],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.001922343233052422,"min":-0.2614386796951294}},{"name":"conv8/bias","shape":[30],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.0009638834233377494,"min":-0.15904076485072866}}],"paths":["tiny_yolov2_model-shard1","tiny_yolov2_model-shard2","tiny_yolov2_model-shard3","tiny_yolov2_model-shard4"]}]
\ No newline at end of file
[{"weights":[{"name":"conv0/depthwise_filter","shape":[3,3,3,1],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.004699238725737029,"min":-0.7471789573921876}},{"name":"conv0/pointwise_filter","shape":[1,1,3,16],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.008118405529097015,"min":-1.071629529840806}},{"name":"conv0/bias","shape":[16],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.0024678509609372006,"min":-0.28873856242965246}},{"name":"conv1/depthwise_filter","shape":[3,3,16,1],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.004553892331964829,"min":-0.5737904338275684}},{"name":"conv1/pointwise_filter","shape":[1,1,16,32],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.00980057996862075,"min":-1.3230782957638012}},{"name":"conv1/bias","shape":[32],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.0011220066278588537,"min":-0.20644921952602907}},{"name":"conv2/depthwise_filter","shape":[3,3,32,1],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.0032098570290733787,"min":-0.38839270051787883}},{"name":"conv2/pointwise_filter","shape":[1,1,32,64],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.008682825051101984,"min":-1.154815731796564}},{"name":"conv2/bias","shape":[64],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.0015120926440930834,"min":-0.21471715546121783}},{"name":"conv3/depthwise_filter","shape":[3,3,64,1],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.003597520496331009,"min":-0.4317024595597211}},{"name":"conv3/pointwise_filter","shape":[1,1,64,128],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.010341314240997913,"min":-1.3650534798117246}},{"name":"conv3/bias","shape":[128],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.002109630785736383,"min":-0.4113780032185947}},{"name":"conv4/depthwise_filter","shape":[3,3,128,1],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.004783747476689955,"min":-0.6171034244930043}},{"name":"conv4/pointwise_filter","shape":[1,1,128,256],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.009566552498761345,"min":-1.2627849298364977}},{"name":"conv4/bias","shape":[256],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.0020002245903015135,"min":-0.3860433459281921}},{"name":"conv5/depthwise_filter","shape":[3,3,256,1],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.004355777244941861,"min":-0.4791354969436047}},{"name":"conv5/pointwise_filter","shape":[1,1,256,512],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.010036561068366555,"min":-1.2545701335458193}},{"name":"conv5/bias","shape":[512],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.0023248311935686597,"min":-0.42776893961663337}},{"name":"conv6/depthwise_filter","shape":[3,3,512,1],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.004659063442080629,"min":-0.5963601205863205}},{"name":"conv6/pointwise_filter","shape":[1,1,512,1024],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.010061494509379069,"min":-1.2576868136723836}},{"name":"conv6/bias","shape":[1024],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.0029680932269376867,"min":-0.3947563991827123}},{"name":"conv7/depthwise_filter","shape":[3,3,1024,1],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.003887363508635876,"min":-0.48980780208812036}},{"name":"conv7/pointwise_filter","shape":[1,1,1024,1024],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.009973861189449535,"
min":-1.2766542322495404}},{"name":"conv7/bias","shape":[1024],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.004667898486642276,"min":-0.6955168745096991}},{"name":"conv8/filters","shape":[1,1,1024,25],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.01573958116419175,"min":-2.5340725674348716}},{"name":"conv8/bias","shape":[25],"dtype":"float32","quantization":{"dtype":"uint8","scale":0.009396760662396749,"min":-2.2552225589752197}}],"paths":["tiny_yolov2_separable_conv_model-shard1"]}]
\ No newline at end of file