Due date: Wednesday, May 13, 11:59pm.
Mandatory weekly progress reports are due every Monday at 11:59pm, starting Monday 04/20, and ending Monday 05/11.
The project constitutes 25% of the course grade. Students who have not already obtained the instructor's consent for a different project must do the project specified here. After April 13, progress reports from all students are due each Monday at 11:59pm (by e-mail to the instructor and the GTAs), until the project is submitted.
Task
Implement a face detector that is trained using AdaBoost, combines information from skin color and rectangle filters, and utilizes the ideas of bootstrapping and classifier cascades. In particular, the following methods/concepts must be utilized in this face detector:
- AdaBoost: The face detector, or at least components of the face detector, must be trained using AdaBoost and rectangle filters.
- Skin detection: A skin detector must be used to improve the efficiency of face detection on color images.
- Bootstrapping: Bootstrapping is a method for improving the quality of the training set by identifying and including more challenging examples. Bootstrapping is performed by iterating between (1) training a face detector using the training set, and (2) applying the face detector to additional data and adding to the training set the cases where the face detector makes mistakes.
- (CSE 6367 only) Classifier cascades: An AdaBoost-based detector was implemented in the code posted for Lecture 13. In that detector, the same strong classifier was applied to every single window of the image. A classifier cascade is a sequence of classifiers, where the first classifier is very fast but relatively inaccurate, and each subsequent classifier is slower but more accurate. For every classifier in the cascade (except for the final classifier), we need to choose a threshold that determines whether a window should be classified as a nonface, or should be passed on to the next classifier in the cascade. That threshold should be chosen so that it causes as few mistakes as possible. See Lecture 20 for details.
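The cascade idea described in the last item can be sketched as follows. This is only an illustrative sketch in Python with NumPy, not the code posted for the lectures: the names `choose_rejection_threshold`, `cascade_classify`, and `max_miss_rate` are hypothetical, each stage is assumed to be a function that maps a window to a real-valued score (higher means more face-like), and for simplicity the final stage is also given a threshold.

```python
import numpy as np

def choose_rejection_threshold(face_scores, max_miss_rate=0.0):
    """Pick the highest threshold that rejects at most max_miss_rate
    of the given face scores (scores of known face windows at one stage)."""
    scores = np.sort(np.asarray(face_scores, dtype=float))
    allowed_misses = int(np.floor(max_miss_rate * len(scores)))
    # Keep the index valid even if max_miss_rate is 1.0.
    allowed_misses = min(allowed_misses, len(scores) - 1)
    # Windows scoring strictly below the threshold are rejected, so using
    # the allowed_misses-th smallest face score rejects exactly that many.
    return scores[allowed_misses]

def cascade_classify(window, stages):
    """stages: list of (score_function, threshold), fastest stage first.
    Returns True only if the window passes every stage."""
    for score_function, threshold in stages:
        if score_function(window) < threshold:
            return False  # rejected early; later (slower) stages never run
    return True
```

With `max_miss_rate=0.0`, a stage never rejects a face window from the set used to choose the threshold, at the cost of passing more nonface windows on to the next stage. How to trade off these two kinds of mistakes is one of the design decisions left to you.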
Data
All the data that you need for this project can be downloaded from the zipped file training_test_data.zip, which has size 47MB. You can access individual files from directory training_test_data. When you train your system, you can use windows from images in directory training_faces as positive examples. As negative examples, you can use windows from images in directory training_nonfaces. For bootstrapping, once you have trained a detector, you should apply it to all images in training_faces and training_nonfaces, identify windows where the detector makes mistakes, add those windows to the training set, and retrain.
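The bootstrapping procedure described above can be sketched as a simple loop. This is a minimal sketch under stated assumptions, not a required design: `bootstrap`, `train`, `predict`, and `max_rounds` are hypothetical names, `train` is assumed to take a list of (window, label) pairs and return a detector, and `predict` is assumed to return a label for one window. The stopping rule used here (no mistakes left, or a fixed number of rounds) is only one possibility.

```python
def bootstrap(train, predict, training_set, candidate_pool, max_rounds=5):
    """Alternate between training a detector and adding its mistakes
    on candidate_pool to the training set.

    training_set, candidate_pool: lists of (window, label) pairs.
    Returns the final detector and the enlarged training set.
    """
    training_set = list(training_set)
    pool = list(candidate_pool)
    detector = train(training_set)
    for _ in range(max_rounds):
        mistakes, still_correct = [], []
        for window, label in pool:
            if predict(detector, window) != label:
                mistakes.append((window, label))
            else:
                still_correct.append((window, label))
        if not mistakes:
            break  # the detector is correct on every remaining candidate
        training_set.extend(mistakes)  # add the hard examples
        pool = still_correct           # do not add the same window twice
        detector = train(training_set)
    return detector, training_set
```

In this project, the candidate pool would be built from windows of the images in training_faces and training_nonfaces only, never from the test directories.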
As test data, you can use the images in directories test_cropped_faces, test_face_photos, and test_nonfaces. You are not allowed to use any of this test data for training.
Grading
Grading will be based on how well you applied knowledge that you obtained in this course in order to design a detector that is accurate and efficient. For CSE 6367, 22.5% of the project grade will be assigned to each of the four components of the system (AdaBoost, skin detection, bootstrapping, cascades). Another 10% will be based on the four weekly progress reports, and will consider timeliness of submission, quality of description, and evidence of progress.
For CSE 4392, 30% of the project grade will be assigned to each of the three mandatory components of the system (AdaBoost, skin detection, bootstrapping), and 10% will be based on the four weekly progress reports. Implementing classifier cascades is worth 5% extra credit.
In addition to correctness and quality of implementation, you will also be graded based on the decisions and choices you make in building your system. You will have to make several decisions, including:
- How many training examples to use for AdaBoost (what is too many? what is too few?)
- What image windows to use as positive examples and as negative examples.
- When to stop the bootstrapping process of training new detectors and using those detectors to identify cases that should be added to the training examples.
- How exactly to use skin detection to improve the accuracy and efficiency of face detection. Choices that can be justified based on performance on training data will be preferred, and how well those choices are justified will also be a criterion for grading.
- How many classifiers to use in the cascade, and how to design each of them. You should justify the choices you make, and indicate why you feel that those choices are appropriate.
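As an illustration of the skin detection decision listed above, one possible way to use skin detection for efficiency is to skip windows that contain too few skin pixels before running the face classifier on them. The sketch below (Python with NumPy) assumes that a binary skin mask has already been produced by some skin detector. The function names and the default `min_skin_fraction=0.4` are hypothetical placeholders; any actual value should be justified from performance on the training data.

```python
import numpy as np

def integral_image(values):
    """Summed-area table with an extra zero row and column at the top/left."""
    table = np.zeros((values.shape[0] + 1, values.shape[1] + 1))
    table[1:, 1:] = values.cumsum(axis=0).cumsum(axis=1)
    return table

def window_sum(table, top, left, height, width):
    """Sum of the original values inside one window, in constant time."""
    bottom, right = top + height, left + width
    return (table[bottom, right] - table[top, right]
            - table[bottom, left] + table[top, left])

def candidate_windows(skin_mask, window_size, min_skin_fraction=0.4, step=1):
    """Return (top, left) corners of square windows whose fraction of
    skin pixels is at least min_skin_fraction."""
    skin_mask = np.asarray(skin_mask, dtype=float)
    table = integral_image(skin_mask)
    rows, cols = skin_mask.shape
    area = float(window_size * window_size)
    keep = []
    for top in range(0, rows - window_size + 1, step):
        for left in range(0, cols - window_size + 1, step):
            skin_pixels = window_sum(table, top, left, window_size, window_size)
            if skin_pixels / area >= min_skin_fraction:
                keep.append((top, left))
    return keep
```

Only the windows returned by `candidate_windows` would then be passed to the face classifier. The integral image is the same device that makes rectangle filters fast, so the skin check adds little cost per window.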
In general, this project is intended to be a simulation of a project that you could be assigned when working in the real world. In such projects, much less is specified than in a typical homework assignment; the system designer needs to evaluate different choices at each step, and finally make choices that lead to a good product/system. During this course you have learned a variety of different computer vision methods, and you have also encountered several different approaches for making system design choices and for justifying those choices. This is an opportunity to use what you have learned.
Presentations
Students will present their project implementations in a 5-10 minute presentation, for which slides should be prepared. Presentation times will be during finals week, and will be arranged by the instructor. The presentations should specify the main choices that were made in designing the system, and the accuracy/efficiency of the results that were obtained on the test data.