AIMS TO PROJECT 3D TV
Tokyo - Imagine watching a football match on a TV that not only shows the players in three
dimensions but also lets you experience the smells of the stadium and maybe even pat a goal
scorer on the back.
Japan plans to make this futuristic television a commercial reality by 2020 as part of a
broad national project that will bring together researchers from the government, technology
companies and academia.
The targeted "virtual reality" television would allow people to view high definition
images in 3D from any angle, in addition to being able to touch and smell the objects being
projected upwards from a screen to the floor.
"Can you imagine hovering over your TV to watch Japan versus Brazil in the finals of
the World Cup as if you are really there?" asked Yoshiaki Takeuchi, development at Japan's
Ministry of Internal Affairs and Communications.
While companies, universities and research institutes around the world have made
some progress on reproducing 3D images suitable for TV, developing the technologies to
create the sensations of touch and smell could prove the most challenging, Takeuchi said in
an interview with Reuters.
Researchers are looking into ultrasound, electric stimulation and wind pressure as
potential technologies for touch.
Such a TV would have a wide range of potential uses. It could be used in home-
shopping programs, allowing viewers to "feel" a handbag before placing their order, or in the
medical industry, enabling doctors to view or even perform simulated surgery on 3D images
of someone's heart.
The future TV is part of a larger national project under which Japan aims to promote
"universal communication," a concept whereby information is shared smoothly and
intelligently regardless of location or language.
Takeuchi said an open forum covering a broad range of technologies related to
universal communication, such as language translation and advanced Web search techniques,
could be established by the end of this year.
Researchers from several top firms including Matsushita Electric Industrial Co. Ltd.
and Sony Corp. are members of a study group that released a report on the project last month.
The ministry plans to request a budget of more than 1 billion yen to help fund the project in
the next fiscal year starting in April 2006.
CHAPTER-2
INTRODUCTION
Three-dimensional TV is expected to be the next revolution in the history of television. They
implemented a 3D TV prototype system with real-time acquisition, transmission, & 3D
display of dynamic scenes. They developed a distributed, scalable architecture to manage the
high computation & bandwidth demands. The 3D display shows high-resolution stereoscopic
color images for multiple viewpoints without special glasses. This is the first real-time end-to-end
3D TV system with enough views & resolution to provide a truly immersive 3D experience.
2.1 Why 3D TV
The evolution of visual media such as cinema and television is one of the major
hallmarks of our modern civilization. In many ways, these visual media now define our
modern life style. Many of us are curious: what is our life style going to be in a few years?
What kind of films and television are we going to see? Although cinema and television both
evolved over decades, there were stages, which, in fact, were once seen as revolutions:
1) at first, films were silent, then sound was added;
2) cinema and television were initially black-and-white, then color was introduced;
3) computer imaging and digital special effects have been the latest major novelty.
So the question is: what is the next revolution in cinema and television going to be?
If we look at these stages precisely, we can notice that all types of visual media have
been evolving closer to the way we see things in real life. Sound, colors and computer
graphics brought a good part of it, but in real life we constantly see objects around us at close
range, we sense their location in space, we see them from different angles as we change
position. This has not been possible in ordinary cinema. Movie images lack true
dimensionality and limit our sense that what we are seeing is real.
Nearly a century ago, in the 1920s, the great film director Sergei Eisenstein said that
the future of cinematography lay in 3D motion pictures. Many other cinema pioneers
thought in the same way. Even the Lumière brothers experimented with three-dimensional
(stereoscopic) images using two films painted in red and blue (or green) colors and projected
simultaneously onto the screen. Viewers saw stereoscopic images through glasses, painted in
the opposite colors. But the resulting image was black-and-white, as in the first feature
stereoscopic film "The Power of Love" (1922, USA, Dir. H. Fairall).
CHAPTER-3
Basics of 3D TV
Humans gain three-dimensional information from a variety of cues. Two of the most
important ones are binocular parallax & motion parallax.
Fig.3.1 Depth Perception
As shown in the figure, each eye captures its own view and the two separate images
are sent on to the brain for processing. When the two images arrive simultaneously in the
back of the brain, they are united into one picture. The mind combines the two images by
matching up the similarities and adding in the small differences. The small differences
between the two images add up to a big difference in the final picture! The combined image
is more than the sum of its parts. It is a three-dimensional stereo picture.
The word "stereo" comes from the Greek word "stereos" which means firm or solid.
With stereovision you see an object as solid in three spatial dimensions: width, height and
depth, or x, y and z. It is the added perception of the depth dimension that makes stereovision
so rich and special.
Fig.3.2 Stereoscopic Images
As you can see, a stereoscopic image is composed of a right perspective frame and a left
perspective frame - one for each eye.
When your right eye views the right frame and the left frame is viewed by your left
eye, your brain will perceive a true 3D view.
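The link between the small left/right image differences and perceived depth can be made concrete with a short sketch. It assumes an idealized pair of parallel pinhole cameras, a model the text does not state, for which distance is Z = f * B / d, where f is the focal length in pixels, B the spacing between the two viewpoints, and d the horizontal shift (disparity) of a point between the two frames. The function name and the numeric values are illustrative only.

```python
def depth_from_disparity(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    """Distance (in meters) to a point seen by two parallel pinhole cameras.

    Assumed idealized model: Z = f * B / d. Larger disparity means a closer point.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return focal_px * baseline_m / disparity_px


# Hypothetical values: 65 mm viewpoint spacing (roughly human eye spacing)
# and a focal length of 1000 pixels.
near = depth_from_disparity(1000.0, 0.065, 65.0)  # large shift between frames
far = depth_from_disparity(1000.0, 0.065, 6.5)    # small shift between frames
print(f"near point: {near:.1f} m, far point: {far:.1f} m")
```

Under these assumptions a tenfold smaller disparity corresponds to a tenfold greater distance, which is why the differences between the two frames carry the depth information.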
Fig.3.3 Stereoscopes
3.2.3 Stereoscope
It is an optical device for creating stereoscopic (or three-dimensional) effects from flat
(two-dimensional) images; D. Brewster first constructed the stereoscope in 1844. It is
provided with lenses, under which two equal images are placed, so that one is viewed with
the right eye and the other with the left. Observed at the same time, the two images merge
into a single virtual image, which, as a consequence of our binocular vision, appears to be
three-dimensional.
For those wondering what "stereoscopic" is all about, viewing stereoscopic images
gives an enhanced depth perception. This is similar to the depth perception we get in real life,
the same effect IMAX 3D and many computer games now provide.
CHAPTER-4
1. Distributed architecture
2. Scalability
3. Multiview video rendering
4. High-resolution 3D display
5. Computational alignment for 3D display
Typical scene models are per-pixel depth maps, the visual hull, or a prior model of the
acquired objects, such as human body shapes, as shown in figure 4.
Fig.4.1 Interpolations
It has been shown that even coarse scene models improve the image quality during view
synthesis. It is possible to achieve very high image quality with a two-layer image
representation that includes automatically extracted boundary mattes near depth discontinuities.
The Blue-C system consists of a room-sized environment with real-time capture & spatially
immersive display. All 3D video systems provide the ability to interactively control the
viewpoint, a feature that has been termed free-viewpoint video by the MPEG Ad-Hoc
Group on 3D Audio & Video (3DAV). Real-time acquisition of scene models for general,
real-world scenes is very difficult. Many systems do not provide real-time end-to-end
performance, and if they do they are limited to simple scenes with only a handful of objects.
A dense light field representation does not require a scene model but, on the other
hand, dense light fields require more storage & transmission bandwidth. Light
field systems are therefore our next topic.
Capturing dynamic light fields has only recently become feasible. Some systems use a bundle of optical
fibers in front of a high-definition camera to capture multiple views simultaneously. The
problem with a single camera is that its limited resolution greatly reduces the
number & resolution of the acquired views. A dense array of synchronized cameras yields
high-resolution light fields. These cameras are connected to a cluster of PCs. Such a camera
array can consist of up to 128 cameras & special-purpose hardware to compress & store all the
video data in real time. Most light field cameras allow interactive navigation & manipulation
of the dynamic scene. Now, let's move on to the architecture of the 3D TV.
CHAPTER-5
ARCHITECTURE OF 3D TV
Fig.5.1 3D TV System
1. Acquisition
2. Transmission
3. Display Unit
The system consists mostly of commodity components that are readily available
today. Note that the overall architecture of the system accommodates different display types.
Let's understand the three blocks one after another.
5.1 Acquisition
As explained above, each camera captures progressive high-definition video in real time.
They use 16 Basler A101fc color cameras with 1300×1030, 8 bits per pixel
CCD sensors. You may be wondering what CCD image sensors
& MPEG coding are.
5.1.2 MPEG-2 Encoding
Now, the cameras are connected by IEEE-1394 High Performance Serial Bus to the
producer PCs. The maximum transmitted frame rate at full resolution is 12 frames per
second. Each of the eight producer PCs is connected to two cameras. All PCs in this
prototype have 3 GHz Pentium 4 Processors, 2 GB of RAM, & run Windows XP.
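As a rough check on why the cameras are spread across eight PCs, the uncompressed data rates implied by the figures above can be worked out directly. The sketch below uses only numbers quoted in the text (1300 by 1030 pixels, 8 bits per pixel, 12 frames per second, 16 cameras, two per PC) and assumes the sensor output is transmitted as one byte per pixel with no protocol overhead, which is a simplification.

```python
WIDTH, HEIGHT = 1300, 1030   # sensor resolution quoted in the text
BYTES_PER_PIXEL = 1          # 8 bits per pixel
FPS = 12                     # maximum full-resolution frame rate
CAMERAS = 16
CAMERAS_PER_PC = 2

bytes_per_frame = WIDTH * HEIGHT * BYTES_PER_PIXEL
per_camera_bps = bytes_per_frame * FPS * 8          # bits per second, one camera
per_pc_bps = per_camera_bps * CAMERAS_PER_PC        # load on one producer PC
total_bps = per_camera_bps * CAMERAS                # whole camera array

print(f"one frame:   {bytes_per_frame / 1e6:.3f} MB")
print(f"one camera:  {per_camera_bps / 1e6:.1f} Mbit/s")
print(f"one PC:      {per_pc_bps / 1e6:.1f} Mbit/s")
print(f"all cameras: {total_bps / 1e9:.2f} Gbit/s")
```

Under these assumptions each camera produces roughly 129 Mbit/s of raw video and the full array about 2 Gbit/s, which is consistent with the text's emphasis on a distributed architecture and on compressing the streams before transmission.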
They chose the Basler cameras primarily because they have an external trigger that allows
for complete control over the video timing. They have built a PCI card with a custom
programmable logic device (CPLD) that generates the synchronization signal for all the
cameras. So, what is a PCI card?
The power and speed of computer components has increased at a steady rate since
desktop computers were first developed decades ago. Software makers create new
applications capable of utilizing the latest advances in processor speed and hard drive
capacity, while hardware makers rush to improve components and design new technologies
to keep up with the demands of high end software.
Fig.5.3 PCI Card
There's one element, however, that often escapes notice - the bus. Essentially, a bus is
a channel or path between the components in a computer. Having a high-speed bus is as
important as having a good transmission in a car. If you have a 700-horsepower engine
combined with a cheap transmission, you can't get all that power to the road. There are many
different types of buses. The one of interest here is the Peripheral Component Interconnect (PCI).
All 16 cameras are individually connected to the card, which is plugged into one
of the producer PCs. Although it is possible to use software synchronization, they consider
precise hardware synchronization essential for dynamic scenes. Note that the price of the
acquisition cameras can be high, since they will be mostly used in TV studios.
They arranged the 16 cameras in a regularly spaced linear array. See figure 8.
5.2 Transmission
This strategy has other advantages. Existing broadband protocols & compression
standards do not need to be changed for immediate real world 3D TV experiments. This
system can plug into today's digital TV broadcast infrastructure & co-exist in perfect
harmony with 2D TV.
Because they did not have access to digital broadcast equipment, they implemented the
modified architecture shown in figure 9.
Fig.5.5 Modified System
Eight producer PCs are connected by gigabit Ethernet to eight consumer PCs. Video
streams at full camera resolution (1300×1030) are encoded with MPEG-2 & immediately
decoded on the producer PCs. This essentially corresponds to a broadband network with
infinite bandwidth & almost zero delay. The gigabit Ethernet provides all-to-all connectivity
between decoders & consumers, which is important for distributed rendering & display
implementation. So, what is gigabit Ethernet?
The receiver side is responsible for generating the appropriate images to be displayed.
The system needs to be able to provide all possible views to the end users at every instance.
The decoder receives a compressed video stream, decodes it, and stores the current
uncompressed source frame in a buffer, as shown in figure 10. Each consumer has a virtual
video buffer (VVB) with data from all current source frames (i.e., all acquired views at a
particular time instance).
The consumer then generates a complete output image by processing image pixels
from multiple frames in the VVB. Due to bandwidth & processing limitations it would be
impossible for each consumer to receive the complete source frames from all the decoders.
This would also limit the scalability of the system.
One approach is a one-to-one mapping between cameras & projectors, but it is not very flexible.
For example, the cameras need to be equally spaced, which is hard to achieve in practice.
Moreover, this method cannot handle the case when the number of cameras & projectors is
not same.
Another, more flexible approach is to use image-based rendering to synthesize
views at the correct virtual camera positions. They use unstructured lumigraph
rendering on the consumer side. They choose a plane that is roughly in the center of the
depth of field. The virtual viewpoints for the projected images are chosen at even spacing.
Now focus on the processing for one particular consumer, i.e., one particular view. For each
pixel o(u, v) in the output image, the display controller can determine the view number v &
the position (x, y) of each source pixel s(v, x, y) that contributes to it.
To generate output views from incoming video streams, each output pixel is a linear
combination of k source pixels:
$$o(u, v) = \sum_{i=1}^{k} w_i \, s(v, x, y) \qquad (1)$$
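The weighted sum in equation (1) can be sketched in a few lines of code. This is a minimal illustration rather than the prototype's implementation: the dictionary-based virtual video buffer, the lookup-table layout, and the names `blend_pixel` and `render` are all assumptions made for the example, and pixels are single float values instead of color triples.

```python
from typing import Dict, List, Tuple

# Assumed layout: the virtual video buffer maps a view number to a decoded
# source frame stored as frame[y][x].
VVB = Dict[int, List[List[float]]]
# Assumed lookup-table entry for one output pixel: k taps of (weight, view, x, y).
Taps = List[Tuple[float, int, int, int]]


def blend_pixel(vvb: VVB, taps: Taps) -> float:
    """Equation (1): linear combination of the k source pixels for one output pixel."""
    return sum(w * vvb[view][y][x] for (w, view, x, y) in taps)


def render(vvb: VVB, lut: Dict[Tuple[int, int], Taps], width: int, height: int) -> List[List[float]]:
    """Build one output image by evaluating equation (1) at every output pixel."""
    return [[blend_pixel(vvb, lut[(u, row)]) for u in range(width)] for row in range(height)]


# Two hypothetical source views, each a 2x1 frame.
vvb = {0: [[10.0, 20.0]], 1: [[30.0, 40.0]]}
# Precomputed weights and source positions for a 2x1 output image (k = 2).
lut = {
    (0, 0): [(0.5, 0, 0, 0), (0.5, 1, 0, 0)],
    (1, 0): [(0.25, 0, 1, 0), (0.75, 1, 1, 0)],
}
image = render(vvb, lut, width=2, height=1)
print(image)  # [[20.0, 35.0]]
```

Because the weights and source positions depend only on the virtual view, they can be computed once and reused for every frame, leaving a fixed number of multiply-adds per output pixel.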
The blending weights w can be pre-computed by the controller based on the virtual
view information. The controller sends the position (x, y) of the k source pixels to each
decoder v for pixel selection. The index c of the requesting consumer is sent to the decoder
for pixel routing from decoders to the consumer. Optionally, multiple pixels can be buffered
in the decoder for pixel block compression before being sent over the network. The
consumer decompresses the pixel blocks & stores each pixel in VVB number v at position (x,
y). Each output pixel requires pixels from k source frames. That means that the maximum
bandwidth on the network to the VVB is k times the size of the output image times the
number of frames per second (fps). This can be substantially reduced if pixel block
compression is used, at the expense of more processing. To provide scalability it is
important that this bandwidth is independent of the total number of transmitted views.
The processing requirements in the consumer are extremely simple. It needs to compute
equation (1) for each output pixel. The weights are precomputed & stored in a lookup table.
The memory requirements are k times the size of the output image. Assuming simple pixel
block compression, consumers can easily be implemented in hardware. That means decoders,
networks, & consumers could be combined on one printed circuit board. Let's move on to
the different types of display.
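The bandwidth and memory bounds stated above (k times the output image size, times the frame rate for bandwidth) can be evaluated for concrete numbers. In the sketch below the value k = 3 and the 3 bytes per pixel are assumptions chosen for illustration; the 1024 by 768 output size is the projector resolution mentioned in the display chapter, and 12 fps is the acquisition frame rate. The function name `vvb_requirements` is hypothetical.

```python
def vvb_requirements(k: int, width: int, height: int, bytes_per_pixel: int, fps: int):
    """Upper bounds for one consumer, without pixel block compression.

    memory    = k * output image size
    bandwidth = k * output image size * fps
    Note that the number of transmitted views does not appear in either bound.
    """
    output_bytes = width * height * bytes_per_pixel
    memory_bytes = k * output_bytes
    bandwidth_bytes_per_s = memory_bytes * fps
    return memory_bytes, bandwidth_bytes_per_s


# Assumed: k = 3 source pixels per output pixel, 3 bytes per pixel.
memory, bandwidth = vvb_requirements(k=3, width=1024, height=768, bytes_per_pixel=3, fps=12)
print(f"VVB memory per consumer: {memory / 2**20:.2f} MiB")
print(f"peak network load:       {bandwidth * 8 / 1e6:.1f} Mbit/s")
```

With these assumed values the peak load per consumer stays below the capacity of a gigabit link, and adding more cameras changes neither figure, which is the scalability property the paragraph describes.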
CHAPTER-6
It is widely acknowledged that Dennis Gabor invented the hologram in 1948 while he was
working on an electron microscope. He coined the word and received the Nobel Prize in 1971 for
inventing holography. The holographic image is truly three-dimensional: it can be
viewed from different angles without glasses. This innovation could be a new revolution, a new
era of holographic cinema and of holographic media as a whole.
Holographic techniques were first applied to image display by Leith & Upatnieks in 1962.
In holographic reproduction, light from an illumination source is diffracted by interference
fringes on the holographic surface to reconstruct the light wavefront of the original object. A hologram
displays a continuous analog light field and has long been considered the "holy grail" of 3D TV. One
device, the Mark-2 Holographic Video Display, uses acousto-optic modulators, beam
splitters, moving mirrors & lenses to create interactive holograms. In more recent systems,
moving parts have been eliminated by replacing the acousto-optic modulators with LCDs,
focused light arrays, optically addressed spatial modulators, or digital micromirror devices.
The figure shows a holographic image.
All current holo-video devices use single-color laser light. To reduce the amount of
display data they provide only horizontal parallax. The display hardware is very large in
relation to the size of the image, so holographic video of this kind cannot be done in real time.
We have developed the world's first holographic equipment with the capability of
projecting genuine 3-dimensional holographic films as well as holographic slides and real
objects for multiple viewers simultaneously. Our Holographic Technology was
primarily designed for cinema. However, it has many uses in advertising and show business as
well.
At the same time we have developed a new 3D digital image processing and projecting
technology. It can be used to create modern 3D digital movie theaters and for the
computer modeling of 3D virtual realities as well. On the same principle we have already
tested a 3D color TV system. In all cases the audience can see colorful 3D images without inconvenient
accessories.
Developed in the Holographic Laboratories of Professor Victor Komar (NIKFI), these
technologies have received worldwide recognition, including an Oscar for Technical
Achievement in Hollywood, a Nika Film Award in Moscow, endorsement from MIT's Media
Lab and many others.
Drawbacks of volumetric displays include their limited color reproduction & lack of occlusions. The design of large size volumetric
displays also poses some difficult obstacles.
Fig.6.2 Images of a scene from the viewer side of the display (top row) and
as seen from some of the cameras (bottom row).
CHAPTER-7
3D DISPLAY
This is a brief explanation that we hope sorts out some of the confusion about the many
3D display options that are available today. We'll tell you how they work, and what the
relative tradeoffs of each technique are.
The figure shows a diagram of the multi-projector 3D display with lenticular sheets.
They use 16 NEC LT-170 projectors with 1024×768 native output resolution. This is
less than the resolution of the acquired & transmitted video, which has 1300×1030 pixels.
However, HDTV projectors are much more expensive than commodity projectors.
Commodity projectors also have a compact form factor. Out of the eight consumer PCs, one is dedicated
as the controller. The consumers are identical to the producers except for a dual-output
graphics card that is connected to two projectors. The graphics card is used only as an output
device.
For the rear-projection system shown in the figure, two lenticular sheets are mounted
back-to-back with optical diffuser material in the center. The front-projection system uses
only one lenticular sheet with a retroreflective front-projection screen material made from flexible
fabric mounted on the back. The photographs show the rear and front projection.
Fig.7.2 Rear Projection and Front Projection
Interest in 3D has never been greater. The amount of research and development on 3D
photographic, motion picture and television systems is staggering. Over 1000 patent
applications have been filed in these areas in the last ten years. There are also hundreds of
technical papers and many unpublished projects.
I have worked with numerous systems for 3D video and 3D graphics over the last 20
years and have developed and marketed many products. In order to give some historical
perspective I’ll start with an account of my 1985 visit to Exposition 85 in Tsukuba, Japan. I
spent a month in Japan visiting with 3D researchers and attending the many 3D exhibits at the
Tsukuba Science Exposition. The exposition was one of the major film and video events of
the century, with a good chunk of its 2 1/2 billion dollar cost devoted to state of the art
audiovisual systems in more than 25 pavilions. There was the world’s largest IMAX screen,
Cinema-U (a Japanese version of IMAX), OMNIMAX (a dome projection version of IMAX
using fisheye lenses) in 3D, numerous 5, 8 and 10 perforation 70mm systems - several with
fisheye lens projection onto domes and one in 3D, single, double and triple 8 perforation
35mm systems, live high definition (1125 line) TV viewed on HDTV sets and HDTV video
projectors (and played on HDTV video discs and VTR’s), and giant outdoor video screens
culminating in Sony’s 30 meter diagonal Jumbotron (also presented in 3D). Included in the
3D feast at the exposition were four 3D movie systems, two 3DTV systems (one without
glasses), a 3D slide show, a Pulfrich demonstration (synthetic 3D created by a dark filter in
front of one eye), about 100 holograms of every type, size and quality (the Russians’ were
best), and 3D slide sets, lenticular prints and embossed holograms for purchase. Most of the
technology, from a robot that read music and played the piano to the world’s largest tomato
plant, was developed in Japan in the two years before the exposition, but most of the 3D
hardware and software was the result of collaboration between California and Japan. It was
the chance of a lifetime to compare practically all of the state of the art 2D and 3D motion
picture and video systems, tweaked to perfection and running 12 hours a day, seven days a
week. After describing the systems at Tsukuba, I will survey some of the recent work
elsewhere in the world and suggest likely developments during the next decade.
CHAPTER-8
CONCLUSION
Most of the key ideas for 3D TV systems presented in this paper have been known for
decades, such as lenticular screens, multi-projector 3D displays, and camera arrays for
acquisition. This system is the first to provide enough viewpoints and enough pixels per
viewpoint to produce an immersive and convincing 3D experience. One area of future
research is to improve the optical characteristics of the 3D display computationally. This
concept is called computational display. Another area of future research is precise color
reproduction of natural scenes on multiview displays.