Python for Computer Vision

Jarrell Waggoner / @malloc47

Online: malloc47.com/ATO2013/
Code: github.com/malloc47/ato2013-code/

About Me

Computer Vision

  • Subset of AI
  • Concerned with problems that involve visual perception
  • Well-known accomplishments:
    • Face detection
    • Kinect
    • Eye/Fingerprint Identification
    • OCR

Levels of CV

  • Low-level (image processing)
  • Mid-level (features)
  • High-level (human-like understanding)

Doing CV

Python

  • Open Source
  • High-level
  • Multi-paradigm
  • Extensible
  • Library-rich

SciPy/NumPy

  • General scientific and numeric computing libraries
  • Supply the algorithms and data types used by most Python CV libraries

scikit-image

  • Part of the SciKits family, which is analogous to MATLAB toolboxes
  • Similar to MATLAB's Image Processing Toolbox
  • Focuses on image processing and mid-level vision
  • Useful for
    • Morphological operations
    • Edge/corner detection
    • Image filtering
    • Segmentation

scikit-learn

  • Similar to MATLAB's machine learning toolbox
  • Contains well-known classifiers, clustering, metrics, etc.
  • Useful for many vision applications where relationships between features aren't well-established
  • Useful for
    • OCR
    • Object detection
    • Object identification
    • Image matching

OpenCV

  • Originally an Intel project launched in 1999
  • Now run by non-profit OpenCV.org
  • C/C++ interface with some Python bindings
  • Geared toward high-level CV
  • Useful for
    • Face recognition
    • Motion tracking
    • Some learning algorithms
    • Robotics

Others

  • Scikit-learn: Machine learning
  • SimpleCV: Companion to Practical Computer Vision book
  • pymorph / mahotas: extensive morphological algorithm support

Examples

Basics

  • Loading images
  • NumPy array (matrix) operations
  • Viewing images
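
As a minimal sketch of these basics, the snippet below treats an image as a plain NumPy array. The synthetic gradient is a hypothetical stand-in for a file loaded from disk (a loader such as `skimage.io.imread` returns the same kind of array), and the variable names are illustrative only.

```python
import numpy as np

# Hypothetical stand-in for a loaded image: a 4x6 8-bit grayscale gradient.
height, width = 4, 6
row = np.linspace(0, 255, width).astype(np.uint8)
image = np.tile(row, (height, 1))

# An image is just an array: rows x columns, one value per pixel.
print(image.shape, image.dtype)

# Typical array operations used throughout image processing.
inverted = 255 - image                        # photographic negative
crop = image[1:3, 2:5]                        # region of interest via slicing
bright_mask = image > 127                     # boolean mask of bright pixels
as_float = image.astype(np.float64) / 255.0   # rescale to the range [0, 1]
```

For viewing, one common option is matplotlib's `plt.imshow(image, cmap="gray")` followed by `plt.show()`.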

Edge Detection

  • Concerned with identifying strong edges in an image
  • Useful for identifying
    • dominant directions
    • horizon lines
    • general shape
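
One possible illustration of edge detection uses the Sobel operator from `scipy.ndimage`. This is a sketch on a synthetic image, not the method used in the talk; the image and variable names are assumptions made for the example.

```python
import numpy as np
from scipy import ndimage

# Synthetic test image: dark left half, bright right half,
# which gives a single vertical edge between columns 3 and 4.
image = np.zeros((8, 8), dtype=float)
image[:, 4:] = 1.0

# Sobel derivatives along each axis, combined into a gradient magnitude.
gx = ndimage.sobel(image, axis=1)   # change from left to right
gy = ndimage.sobel(image, axis=0)   # change from top to bottom
magnitude = np.hypot(gx, gy)

# Columns where the gradient is non-zero mark the edge location.
edge_columns = np.flatnonzero(magnitude.max(axis=0) > 0)
print(edge_columns)
```

The ratio of `gy` to `gx` gives the local edge direction, which is one way to estimate the dominant directions mentioned above.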

Morphological Operations

  • Based on mathematical morphology
  • Typically use a special "structuring element" shape
  • Useful for a variety of basic image processing
  • Typically fast enough for real-time use
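
The role of the structuring element can be sketched with `scipy.ndimage`. The example below performs an opening (erosion followed by dilation) on a made-up binary image; the image contents and the 3x3 element are assumptions chosen to keep the result easy to check by hand.

```python
import numpy as np
from scipy import ndimage

# Binary image: a 3x3 square plus one isolated "noise" pixel.
image = np.zeros((7, 7), dtype=bool)
image[2:5, 2:5] = True
image[0, 6] = True

# The structuring element defines the neighbourhood probed at each pixel.
structure = np.ones((3, 3), dtype=bool)

# Erosion keeps a pixel only if the whole element fits inside the shape.
eroded = ndimage.binary_erosion(image, structure=structure)

# Dilation grows what survived back out by the same element.
opened = ndimage.binary_dilation(eroded, structure=structure)

print(opened.astype(int))
```

The isolated pixel is smaller than the structuring element, so the opening removes it while the square is restored to its original size.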

Learning

  • ML classifier: consumes many data points during training
  • Classifies new data by matching it to what was learned during training
  • When in doubt, use a Support Vector Machine
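
The train-then-classify workflow might look like the following with scikit-learn's `SVC`. The two-dimensional feature vectors are hypothetical (imagine two measurements per image, such as mean brightness and edge density) and were invented for this sketch.

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical training data: one row per image, two features each.
X_train = np.array([
    [0.0, 0.0], [0.5, 1.0], [1.0, 0.5], [1.0, 1.0],   # class 0
    [5.0, 5.0], [5.5, 4.5], [4.5, 5.5], [6.0, 6.0],   # class 1
])
y_train = np.array([0, 0, 0, 0, 1, 1, 1, 1])

# Training: the classifier learns a boundary between the two classes.
clf = SVC(kernel="linear")
clf.fit(X_train, y_train)

# Classification: new feature vectors are assigned to the learned classes.
X_new = np.array([[0.2, 0.8], [5.2, 5.1]])
predicted = clf.predict(X_new)
print(predicted)
```

Real vision tasks need far more training samples and richer features; the clusters here are deliberately well separated so the behavior is easy to follow.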

Face Detection

  • Built into OpenCV
  • One technique uses a Haar cascade
    • Based on Haar wavelets
    • Haar wavelets were introduced by Alfréd Haar in 1909
  • Uses integral images, so each rectangular feature is computed in constant time
  • Haar features arranged into relationships (eyes in relation to nose, in relation to mouth, in relation to...)
  • Requires a trained "cascade" for classification
  • Many trained cascades available for face detection
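
To show why integral images make Haar-like features fast, here is a NumPy-only sketch. It is not OpenCV's implementation; the helper names `integral_image`, `rect_sum`, and `two_rect_feature` are hypothetical and exist only for this illustration.

```python
import numpy as np

def integral_image(img):
    """Cumulative sums, padded with a leading row and column of zeros."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)
    return ii

def rect_sum(ii, r0, c0, r1, c1):
    """Sum of img[r0:r1, c0:c1] using four lookups, whatever the size."""
    return ii[r1, c1] - ii[r0, c1] - ii[r1, c0] + ii[r0, c0]

def two_rect_feature(ii, r, c, h, w):
    """Haar-like feature: left rectangle sum minus right rectangle sum."""
    left = rect_sum(ii, r, c, r + h, c + w)
    right = rect_sum(ii, r, c + w, r + h, c + 2 * w)
    return left - right

# Check the rectangle sum against a direct NumPy sum.
image = np.arange(16).reshape(4, 4)
ii = integral_image(image)
window_sum = rect_sum(ii, 1, 1, 3, 3)

# A window that is bright on the left and dark on the right
# produces a strong positive response.
split = np.zeros((4, 4), dtype=np.int64)
split[:, :2] = 10
feature = two_rect_feature(integral_image(split), 0, 0, 4, 2)
print(window_sum, feature)
```

A trained cascade evaluates many such features per window, rejecting most windows after only a few cheap tests.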

Gotchas

  • Types: uint vs float
  • Library glue
  • Installation
  • Robustness
  • Algorithm complexity
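
The first gotcha, uint versus float, can be demonstrated with a short NumPy sketch. The pixel values are made up for the example; the point is that 8-bit arithmetic wraps around silently.

```python
import numpy as np

# Two 8-bit "images" with made-up pixel values.
a = np.array([200, 100], dtype=np.uint8)
b = np.array([100, 100], dtype=np.uint8)

# uint8 addition wraps modulo 256: 200 + 100 becomes 44, with no error.
wrapped = a + b

# Widening before the arithmetic avoids the wrap; clip, then convert back.
widened = a.astype(np.int32) + b
saturated = np.clip(widened, 0, 255).astype(np.uint8)

# Many routines expect floats in [0, 1] instead of integers in [0, 255].
as_float = a.astype(np.float64) / 255.0

print(wrapped, saturated, as_float)
```

Checking `dtype` and value range whenever an array passes between libraries catches most of these problems early.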

?