
Computer Vision as universal controller for interactive media

Denis Perevalov
perevalovds@gmail.com
video: www.youtube.com/perevalovds#g/u
lectures: www.uralvision.blogspot.com
Institute of Mathematics and Mechanics, Ural Branch of Russian Academy of Sciences

http://www.instablogsimages.com/images/2008/01/04/aperture-interactive-display_48.jpg
Contents

1. What is a controller
2. Computer vision as universal controller
3. Video cameras for computer vision
4. Controllers based on computer vision I
5. Controllers based on computer vision II
6. Programming technologies
7. The prospects
1. What is a controller
Definition of a controller
A controller is any sensor or other source of data whose output can be converted into a digital
signal.

Proximity sensor
Standard musical controllers

Midi keyboard Midi pad

Standard instruments
Midi track control Breath controller with MIDI output

In fact, these are musical instruments, or tools for instruments, that output digital data
about the sound instead of the sound itself.
Controllers of human motion

Multitouch

Motion Capture

Biosensors
http://neurocenter.unige.ch/groups/pun.php
Controllers of physical phenomena

Waves on the water

Aleatoric water musical instrument


http://www.youtube.com/watch?v=CZ_KijiwQHE
Stochastic controllers

Insect trajectories
"Debug - art by insects"
http://vimeo.com/12645870

Internet, data from the market
2. Computer vision as universal controller
What is computer vision

Computer vision is a branch of computer science that studies automatic image analysis,
with images received from digital video and photo cameras.
What is computer vision
Examples of tasks that can be solved using computer vision:

Segmentation

Analysis of motion (optical flow)
What is computer vision

Examples of tasks that can be solved using computer vision:

Object tracking and measurement

Motion Capture

http://armi.kaist.ac.kr/korean/UserFiles/File/MMPC.JPG
Computer vision as universal controller

Any parameter of a physical process that shows up as mechanical motion, or as a change of
shape, color or transparency, can be digitized using computer vision.
Computer vision as universal controller

How computer vision differs from other sensor types:

1) It makes a huge amount of data available (the pixel array).

2) This data can be structured by extracting the needed parameters from the image
(for example, the position and size of objects).
Limitations of controllers based on computer vision

1. Because of the huge amount of data, such controllers normally work with high latency.

2. Because cameras observe objects from a distance, the precision of measurements
may be insufficient.

3. Cameras need light. (Visible or infrared, but it is needed in any case.)

(We will look at this in more detail below.)


3. Video cameras for computer vision
Basic camera characteristics
Different realtime image analysis tasks may need different kinds of cameras.

The basic characteristics are:

1. Resolution

2. Number of frames per second

3. Type of output data (visible, IR, 3D)


Resolution
The dimensions of the camera image in pixels.

Resolution:                        320 x 240    640 x 480    1280 x 1024
Precision for a 1-meter object:    3.13 mm      1.56 mm      0.97 mm
Size of 30 frames:                 6.6 MB       26.4 MB      112.5 MB

http://www.mtlru.com/images/klubnik1.jpg
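
The figures on the resolution slide can be reproduced with simple arithmetic. The sketch below rests on two assumptions that the slide does not state: that "precision" means the size of one pixel when a 1-meter object spans the image, and that frames are stored uncompressed at 3 bytes per pixel. The function names are illustrative. The slide's 0.97 mm figure for 1280 x 1024 corresponds to the object spanning the 1024-pixel side.

```python
def mm_per_pixel(object_size_mm, pixels_along_object):
    """Size of one pixel, in mm, when the object spans the given number of pixels."""
    return object_size_mm / pixels_along_object


def raw_size_mib(width, height, frames, bytes_per_pixel=3):
    """Size of uncompressed frames in mebibytes (assumes 3 bytes per pixel, 24-bit color)."""
    return width * height * bytes_per_pixel * frames / (1024 * 1024)


# (width, height, pixels spanned by the 1-meter object); the third entry uses 1024
# because that is what matches the figure on the slide.
CAMERAS = [(320, 240, 320), (640, 480, 640), (1280, 1024, 1024)]

for width, height, span in CAMERAS:
    precision = mm_per_pixel(1000, span)
    size = raw_size_mib(width, height, frames=30)
    print(f"{width} x {height}: about {precision:.2f} mm per pixel, "
          f"{size:.1f} MiB for 30 frames")
```

The trade-off is visible directly: halving the pixel size roughly quadruples the amount of data to process per second.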
Number of frames per second

Frame rate:                 30 FPS     60 FPS     150 FPS
Time between two frames:    33 msec    16 msec    6 msec

150 FPS can be used for a musical instrument.

http://www.youtube.com/watch?v=7iEvQIvbn8o
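
The time between two frames follows directly from the frame rate. A minimal sketch (the function name is illustrative); the slide's values are the same numbers with the fractional part dropped:

```python
def frame_interval_ms(fps):
    """Time between two consecutive frames, in milliseconds."""
    return 1000.0 / fps


for fps in (30, 60, 150):
    print(f"{fps} FPS: {frame_interval_ms(fps):.1f} msec between frames")
```

An event can wait up to one frame interval before the camera captures it at all, so this value is a floor on the latency the camera adds, before any image processing time.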
Type of output data

- Color or grayscale image of visible light.
- Infrared image. Using invisible IR light, such a camera can see in dark performance conditions.
- Color image + depth information.
Examples
Sony PS3 Eye

320 x 240 : 150 FPS


640 x 480 : 60 FPS

Data type:
visible light,
IR (you need to remove IR-filter)

Price: $50.

USB, CMOS
Examples
Point Grey Flea3
648 x 488 : 120 FPS

Data type:
- visible light,
- IR (?)

Price: $600.

Model FL3-FW-03S1C-C
IEEE 1394b, CCD
Examples
Microsoft Kinect
640 x 480 : 30 FPS

Data type:
color image + depth

Price: $150.
(Depth is obtained using a special projected IR pattern,
so it will not work in direct sunlight.)
USB, CMOS
Examples
Point Grey BumbleBee2
640 x 480 : 48 FPS

Data type:
color image + depth

Price: $2000.

(Depth is obtained using stereo vision from two cameras.)


IEEE 1394b, CCD
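
For a stereo pair, depth is commonly recovered from disparity, the horizontal shift of a point between the left and right images: Z = f * B / d. The sketch below assumes an idealized, rectified pair of pinhole cameras. The function name and the numbers are hypothetical and are not specifications of any camera mentioned here.

```python
def depth_from_disparity(focal_length_px, baseline_m, disparity_px):
    """Distance to a point, in meters, for an ideal rectified stereo pair.

    focal_length_px: focal length expressed in pixels
    baseline_m:      distance between the two camera centers, in meters
    disparity_px:    horizontal shift of the point between the two images, in pixels
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive; zero means the point is at infinity")
    return focal_length_px * baseline_m / disparity_px


# Hypothetical values: 500 px focal length, 10 cm baseline.
for disparity in (50, 25, 5):
    depth = depth_from_disparity(500, 0.1, disparity)
    print(f"disparity {disparity} px -> depth {depth:.2f} m")
```

Because depth is inversely proportional to disparity, a one-pixel matching error matters much more for distant objects than for near ones.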
4. Controllers based on computer vision I
Sliders and buttons - by camera

- A camera can read the position of a slider without any electronics.

- Your finger can act as a slider or a button; the camera reads its position.
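
One simple way to read a slider by camera can be sketched as follows, assuming the finger or slider knob shows up as the brightest thing in a grayscale region of the frame. The frame is represented as a plain list of pixel rows to keep the sketch self-contained. The function names, the threshold and the mapping to a 0..127 MIDI-style value are all illustrative assumptions.

```python
def slider_value(gray_rows, threshold=128):
    """Horizontal position of the bright marker as 0.0 (left) .. 1.0 (right).

    Returns None when no pixel reaches the threshold (no finger in view).
    """
    total_x = 0
    count = 0
    for row in gray_rows:
        for x, value in enumerate(row):
            if value >= threshold:
                total_x += x
                count += 1
    if count == 0:
        return None
    width = len(gray_rows[0])
    # Mean column of the bright pixels, normalized by the last column index.
    return (total_x / count) / max(width - 1, 1)


def to_midi_cc(value):
    """Map a 0.0 .. 1.0 slider value to the 0 .. 127 range used by MIDI controllers."""
    return round(value * 127)


# Tiny synthetic frame: the marker sits in column 3 of 5.
frame = [
    [0, 0, 0, 255, 0],
    [0, 0, 0, 255, 0],
]
position = slider_value(frame)
print(position, to_midi_cc(position))
```

A button can be read the same way: it is "pressed" when slider_value returns something other than None for the button's region.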
Additive synthesizer using optical measurement of brightness
Using two color markers for creating virtual picture
Gyroscope using camera: AR

Augmented Reality with Markers


6 dimensions: X, Y, Z, and rotation around 3 axes
http://www.edhv.nl/edhv/wp-content/uploads/2009/12/aug_Picture-10_no-border-450x337.jpg
Multitouch
FTIR multitouch

http://www.touchuserinterface.com/2010/02/lcd-multi-touch-using-inverted-ftir.html
http://sites.google.com/site/ideolabsdocumentation/images/multitouchdiagram.png
Virtual reality gloves
Colored gloves tracked using only one webcam, at MIT (prototype)

http://www.csail.mit.edu/videoarchive/research/gv/hand-tracking
Motion Capture

Microsoft Kinect for XBox


(So far, motion capture is available only to XBox developers. But using the OpenKinect library on
Mac/Windows you can get the color and depth images, which is really good for many types of
projects. For Microsoft's official opinion on OpenKinect, see
http://www.thinq.co.uk/2010/11/22/microsoft-declares-openkinect-safe/)
Conclusion

Non-camera sensors are more precise and faster.

Computer vision merits:


- fast implementation when building a prototype
- universality (one camera can be used as different kinds of
sensors)
5. Controllers based on computer vision II
Motion areas detection

The result is the coordinates of the areas with motion.
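
A minimal sketch of motion area detection by frame differencing, one common approach and not necessarily the one used in the talk. Two grayscale frames are given as lists of pixel rows; the frame is split into square cells, and a cell counts as "moving" when its mean absolute difference exceeds a threshold. The cell size and threshold are hypothetical tuning values.

```python
def motion_areas(prev, curr, cell=4, threshold=20):
    """Return a list of (x, y, width, height) cells where the two frames differ."""
    height, width = len(curr), len(curr[0])
    areas = []
    for top in range(0, height, cell):
        for left in range(0, width, cell):
            ys = range(top, min(top + cell, height))
            xs = range(left, min(left + cell, width))
            diff = sum(abs(curr[y][x] - prev[y][x]) for y in ys for x in xs)
            # Mean absolute difference over the cell.
            if diff / (len(ys) * len(xs)) > threshold:
                areas.append((left, top, len(xs), len(ys)))
    return areas


# Synthetic example: an 8 x 8 black frame, then a bright patch appears bottom-left.
previous = [[0] * 8 for _ in range(8)]
current = [[0] * 8 for _ in range(8)]
for y in range(4, 8):
    for x in range(0, 4):
        current[y][x] = 200

print(motion_areas(previous, current))
```

Each returned rectangle can drive a controller parameter directly, for example triggering a sound when motion appears in a given cell.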


Optical flow calculation

The result is a 2D field of motion vectors.
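
To illustrate what a field of motion vectors looks like, here is a sketch based on block matching. This is a simple stand-in for a real optical flow algorithm, chosen because it fits in a few lines: for each block of the previous frame it searches a small neighborhood in the current frame for the best match. The function name, block size and search radius are illustrative assumptions.

```python
def block_flow(prev, curr, block=4, search=2):
    """Return {(left, top): (dx, dy)}, the estimated displacement of each block."""
    height, width = len(prev), len(prev[0])
    flow = {}
    for top in range(0, height - block + 1, block):
        for left in range(0, width - block + 1, block):
            best = None
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    y0, x0 = top + dy, left + dx
                    # Skip candidate windows that fall outside the frame.
                    if y0 < 0 or x0 < 0 or y0 + block > height or x0 + block > width:
                        continue
                    # Sum of absolute differences between the two windows.
                    cost = sum(abs(prev[top + j][left + i] - curr[y0 + j][x0 + i])
                               for j in range(block) for i in range(block))
                    # Prefer the lowest cost; on ties prefer the smaller displacement.
                    key = (cost, abs(dx) + abs(dy))
                    if best is None or key < best[0]:
                        best = (key, (dx, dy))
            flow[(left, top)] = best[1]
    return flow


# Synthetic example: a 2 x 2 bright square moves one pixel to the right.
previous = [[0] * 8 for _ in range(8)]
current = [[0] * 8 for _ in range(8)]
for y in (1, 2):
    for x in (1, 2):
        previous[y][x] = 255
        current[y][x + 1] = 255

for position, vector in sorted(block_flow(previous, current).items()):
    print(position, vector)
```

Blocks without texture carry no motion information, so the tie-break reports them as not moving.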


Objects of interest detection

The result is the coordinates and sizes of the detected objects.
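
One basic way to obtain coordinates and sizes of objects is to threshold the image and group bright pixels into connected regions. The sketch below assumes the objects of interest are brighter than the background, which is only one of many possible detection criteria. The function name and the threshold are illustrative.

```python
def detect_objects(gray, threshold=128):
    """Return bounding boxes (x, y, width, height) of connected bright regions."""
    height, width = len(gray), len(gray[0])
    seen = [[False] * width for _ in range(height)]
    objects = []
    for y in range(height):
        for x in range(width):
            if gray[y][x] < threshold or seen[y][x]:
                continue
            # Flood fill over 4-connected neighbors, tracking the bounding box.
            stack = [(x, y)]
            seen[y][x] = True
            min_x = max_x = x
            min_y = max_y = y
            while stack:
                cx, cy = stack.pop()
                min_x, max_x = min(min_x, cx), max(max_x, cx)
                min_y, max_y = min(min_y, cy), max(max_y, cy)
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and not seen[ny][nx] and gray[ny][nx] >= threshold):
                        seen[ny][nx] = True
                        stack.append((nx, ny))
            objects.append((min_x, min_y, max_x - min_x + 1, max_y - min_y + 1))
    return objects


image = [
    [0, 255, 255, 0, 0,   0],
    [0, 255, 255, 0, 0,   0],
    [0, 0,   0,   0, 0,   0],
    [0, 0,   0,   0, 255, 0],
    [0, 0,   0,   0, 255, 0],
]
print(detect_objects(image))
```

Each box gives four numbers per object, so several objects in view already provide a rich set of controller parameters.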


Non-visual structures in the image

- Sum of the brightness of some selected pixels in the image.

- Vector of selected components of the image's Fast Fourier Transform.

Such characteristics are "non-visual" because they have no direct relation
to areas of the image, to objects, or to their characteristics. But
they are non-random and are sometimes useful.
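
Both kinds of non-visual characteristic can be sketched in a few lines. The example assumes a grayscale image given as a list of rows, and applies the transform to a single row of pixels for brevity. It uses a direct discrete Fourier transform, which yields the same coefficients as an FFT but more slowly. The function names and the choice of pixels and components are hypothetical.

```python
import cmath


def brightness_sum(gray, pixels):
    """Sum of brightness over selected (x, y) pixel coordinates."""
    return sum(gray[y][x] for x, y in pixels)


def dft_magnitudes(samples, components):
    """Magnitudes of selected discrete Fourier transform components of a sequence."""
    n = len(samples)
    magnitudes = []
    for k in components:
        coefficient = sum(s * cmath.exp(-2j * cmath.pi * k * i / n)
                          for i, s in enumerate(samples))
        magnitudes.append(abs(coefficient))
    return magnitudes


image = [
    [10, 20],
    [30, 40],
]
print(brightness_sum(image, [(0, 0), (1, 1)]))

# A row of pixels containing exactly two periods of a cosine pattern.
row = [1, 0, -1, 0, 1, 0, -1, 0]
print(dft_magnitudes(row, components=[0, 1, 2]))
```

Neither value corresponds to a visible object, yet both change in a repeatable way as the scene changes, which is what makes them usable as controller outputs.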
Conclusion

Implementing such controllers with non-camera sensors is expensive,
and sometimes requires building a special-purpose device.
6. Programming technologies
Low-level libraries

"Open Computer Vision Library"

Open library for image analysis, C/C++.


Low-level libraries

"Open Graphics Library"

Open library for rapid graphics, C/C++.


Low-level libraries

"Open Computing Language"

Open library for parallel computing, including on GPUs, C/C++.


It can speed up computation considerably. OpenCV + OpenCL is currently
in the development stage (by Intel).
Middle-level platforms
"Creative coding" platforms with a large number of functions for
convenient programming.

- openFrameworks: C/C++
- Processing: Java; too slow for computer vision
- Cinder: C/C++; young, but its popularity is increasing
High-level platforms
Visual programming platforms for musicians and designers.
Almost no text programming is needed. If needed, they can be
extended with custom plugins developed by programmers on
lower-level platforms.

- Max/MSP/Jitter: audio-oriented
- VVVV: video effects-oriented
- Quest3D: high quality 3D-oriented
7. The prospects
Technological prospects

- Manufacturing of more camera models with FPS > 100, for use in live performance.

- Higher resolution and processing speed, to increase measurement precision
in both space and time.
Algorithmic prospects

http://susiemander.files.wordpress.com/2010/10/facial-expression.jpg

- Robust recognition of emotions and facial expressions.

- Recognition of 3D scenes with many objects (using cameras with depth).
Ideas prospects

1. Multitouch is not yet used the way it should be.

2. How can motion capture technology be used for effective performance?
Ideas prospects

- Searching for new and unusual processes that can be seen by a camera.

- Searching for new, interesting structures in images and in ordinary phenomena.
