layout: doc-page title: Eigenfaces Demo
Credit: original blog post by rawkintrevo. This will be maintained through version changes, blog post will not.
Eigenfaces are an image equivelent(ish) to eigenvectors if you recall your high school linear algebra classes. If you don't recall: read wikipedia otherwise, it is a set of ‘faces’ that by a linear combination can be used to represent other faces.
Their are lots of “image recognition” things out there right now, and deep learning is the popular one everyone is talking about. Deep learning will admittedly do better a recognizing and correctly classifying faces, however it does so at a price.
The advantage/use-case for the eigenfaces approach is when new faces are being regularly added. Even when building a production grade eigenfaces based system- neural networks still have a place- idenitifying faces in images, and creating centered and scaled images around the face. This is scalable because we only need to train our neural network to detect, center, and scale faces once. E.g. a neural network would be deployed as a microservice, and then eigenfaces would be deployed as a microservice.
A production version ends up looking something like this:
The first thing we're going to do is collect a set of 13,232 face images (250x250 pixels) from the Labeled Faces in the Wild data set.
cd /tmp mkdir eigenfaces wget http://vis-www.cs.umass.edu/lfw/lfw-deepfunneled.tgz tar -xzf lfw-deepfunneled.tgz
cd $MAHOUT_HOME/bin ./mahout spark-shell \ --packages com.sksamuel.scrimage:scrimage-core_2.10:2.1.0, \ com.sksamuel.scrimage:scrimage-io-extra_2.10:2.1.0, \ com.sksamuel.scrimage:scrimage-filters_2.10:2.1.0
import com.sksamuel.scrimage._ import com.sksamuel.scrimage.filter.GrayscaleFilter val imagesRDD:DrmRdd[Int] = sc.binaryFiles("/tmp/lfw-deepfunneled/*/*", 500) .map(o => new DenseVector( Image.apply(o._2.toArray) .filter(GrayscaleFilter) .pixels .map(p => p.toInt.toDouble / 10000000)) ) .zipWithIndex .map(o => (o._2.toInt, o._1)) val imagesDRM = drmWrap(rdd= imagesRDD).par(min = 500).checkpoint() println(s"Dataset: ${imagesDRM.nrow} images, ${imagesDRM.ncol} pixels per image")
import org.apache.mahout.math.algorithms.preprocessing.MeanCenter val scaler: MeanCenterModel = new MeanCenter().fit(imagesDRM) val centeredImages = scaler.transform(imagesDRM)
import org.apache.mahout.math._ import decompositions._ import drm._ val(drmU, drmV, s) = dssvd(centeredImages, k= 20, p= 15, q = 0)
import java.io.File import javax.imageio.ImageIO val sampleImagePath = "/home/guest/lfw-deepfunneled/Aaron_Eckhart/Aaron_Eckhart_0001.jpg" val sampleImage = ImageIO.read(new File(sampleImagePath)) val w = sampleImage.getWidth val h = sampleImage.getHeight val eigenFaces = drmV.t.collect(::,::) val colMeans = scaler.colCentersV for (i <- 0 until 20){ val v = (eigenFaces(i, ::) + colMeans) * 10000000 val output = new Array[com.sksamuel.scrimage.Pixel](v.size) for (i <- 0 until v.size) { output(i) = Pixel(v.get(i).toInt) } val image = Image(w, h, output) image.output(new File(s"/tmp/eigenfaces/${i}.png")) }
If using Zeppelin, the following can be used to generate a fun table of the Eigenfaces:
%python r = 4 c = 5 print '%html\n<table style="width:100%">' + "".join(["<tr>" + "".join([ '<td><img src="/tmp/eigenfaces/%i.png"></td>' % (i + j) for j in range(0, c) ]) + "</tr>" for i in range(0, r * c, r +1 ) ]) + '</table>'