Chung, Shih-Kai
Associate Professor, Dept. of Multimedia and Animation Arts, National Taiwan University of Arts
Abstract
Creating realistic human face models and facial animations has been a persistent problem in computer animation. As the foremost channel for conveying a character's emotion, facial animation has drawn researchers' interest for years. This paper surveys the animation techniques developed in computer graphics for facial animation. It summarizes the theoretical approaches used in published work and describes their strengths, weaknesses, and relative performance. First, the problems and current status of facial animation are described. Then, we review a variety of geometric, image-based, and performance-driven techniques of facial animation. Artistic expression control is especially emphasized in evaluating each method. Finally, a summary of how these facial computational models achieve artistic expression is offered as the conclusion.
Key words: Facial expression, Animation principles, Character animation, Motion control, Artistic expression control.
Introduction
One of the most appealing attractions in drama is the vivid personality of the characters conveyed through talented actors' impersonation. This feature is particularly significant in conventional animation, as its characters have virtually no physical limitations and thus more freedom to perform their acts. This partly explains why cartoon animation is consistently called character animation. The success of a character animation relies on two crucial parts: realistic animation of body movement and of facial expression. Motion control of a character's movement has always been the mainstream research field in computer animation. Conversely, and surprisingly, the quality and quantity of studies on facial expression animation are small in proportion to those on body animation. Realistic animation of a 3D character's face represents one of the most difficult challenges in computer animation. The difficulties mainly come from three problems: 1. The representation of the face model. It is extremely difficult to model a face with detailed geometry and natural-looking skin. 2. Motion control of facial expression. Facial expression involves psychology and physiology; adding the complex interactions between bone and muscle makes realistic facial animation exceptionally difficult. 3. The sensitivity of viewers. As facial expression is one of the major ways we communicate with each other, people are very sensitive to the details of facial expressions. Even a subtle change in facial expression can strongly draw the viewer's attention. For the first problem, detailed geometry of a face model can be fairly well accomplished with advanced digitizing technology or the animator's specification. However, the representation of natural skin still presents an important challenge to be overcome, as it deeply impacts the realism of facial expression animation.
In contrast to the comprehensive representation of faces in realistic character animation, cartoon characters are mostly built with far less complicated models, and instead of delicate texturing methods, enhanced shading techniques are usually applied to animate skin color. For the second problem, motion control, facial expression can be very complex and subtle. The movements are mainly driven by the squash and stretch of the underlying muscles, and sometimes involve both internal bone actions and external skin color changes. Motion control of these movements can therefore be very difficult. Because a relatively simple geometry model
and shading technique are employed, realistic movement is rarely the primary goal in cartoon character animation. In fact, to generate dramatic charisma, exaggeration is a common approach for emphasizing the characteristics of a character. Under such circumstances, the comparatively simple representation of the face model, interestingly, gives an advantage in animating facial expressions, as it is more about acting than realism. The last but not least problem is the viewers' sensitivity to facial expressions. The face is the main window onto a person's psychological and physiological states; we see our own and each other's facial expressions in daily life, and are very sensitive to them. Any unnatural motion detail of a facial expression can easily be spotted by viewers, even though isolating the factors behind the incorrect movement is often difficult. This sensitivity of observation further multiplies the difficulty of animating a character's facial expression, especially for high-resolution, realistic face models. For cartoon characters, since relatively simple models are used, the sensitivity to facial expressions remains, but the tolerance of erroneous movements is considerably higher. This might explain why cartoon figures are still employed in most applications of computer animation, even in high-budget commercial films with long production times. Like traditional hand-drawn 2D character animation, which always uses lifelike facial animation to help tell the story, the success of 3D character animation also relies heavily on natural animation of facial expressions. However, the true spatial perception provided by three dimensions also raises the standard of visual demand for every aspect of appearance, and thus makes the production of 3D character animation more difficult than its 2D counterpart.
This paper sets out to deduce guidelines for designing facial expressions of 3D cartoon characters. The research goal is important because facial expression plays a significant part in character animation, the most popular category of animation. The problems to be solved are difficult, as the interaction between psychological states and the physiological movements of facial muscles is still not well understood, and published work concerning motion design of facial expression is rare. This paper surveys mainstream motion control techniques in computer animation, focusing on the solution each method brings to the animation of facial expression. Since several fields have different facial models and research interests, rather than providing a general review of facial animation methods, this study uses an important animation criterion in motion control, artistic expression control, to survey these animation models. The succeeding sections proceed as follows: section 2 introduces facial expression and animation principles; section 3 reviews animation of facial
expression and related animation techniques; section 4 scrutinizes motion design and related animation principles regarding the animation of facial expression; finally, the last section summarizes motion design of facial expression and offers possible future studies as the conclusions.
Facial Expressions
Facial expressions are critical in character animation because they reflect emotions and are often thought of as projections, or read-outs, of a person's mental state (Baron-Cohen et al., 2001). That is, they describe the character and help tell the story. A face in motion is a complicated animated structure in which fine details and rigid and deformable structures run together to produce a variety of expressions. All human faces possess the same physiological organization of bones, muscles, and skin. People smile in the same way: the corners of the mouth lift up through contraction of a muscle called the zygomaticus major, and the eyes crinkle through contraction of the orbicularis oculi muscle. Though all faces are different, people make facial expressions through the same mechanism. Ekman and Friesen detailed which muscles move during which facial expressions. In their historic work, the Facial Action Coding System (or FACS; Ekman & Friesen, 1978), 46 distinct action units (AUs) are identified to describe all visually distinguishable facial activities. FACS assigns each muscle movement an "action unit" number, so a smile is described as AU12, representing an
uplifted mouth, plus AU6, representing crinkled eyes. FACS coding procedures allow not only for coding the intensity and timing of each facial action, but also for coding facial expressions in terms of events. An event is the AU-based description of a facial expression, which may consist of a single AU or many AUs contracted as a single expression. In other words, any facial expression can be created by scaling and/or combining the action units. The comprehensive anatomical studies and parameterized characteristics of action units make FACS easy to integrate into different motion control approaches. Unsurprisingly, most facial modeling systems nowadays describe facial actions based on FACS.
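The scaling-and-combining mechanism can be sketched in a few lines of Python. The sketch below is illustrative only: the two-vertex "face", the displacement values, and the vertex names are hypothetical, not taken from FACS; only the idea of weighting per-AU displacement fields follows the text.

```python
# A minimal sketch of FACS-style expression composition: each action
# unit (AU) is stored as a sparse set of vertex displacements, and an
# expression scales each AU by an intensity and sums the results.

def apply_aus(rest_vertices, aus, intensities):
    """Offset rest-pose vertices by a weighted sum of AU displacement fields.

    rest_vertices: {vertex_id: (x, y, z)}
    aus:           {au_id: {vertex_id: (dx, dy, dz)}}
    intensities:   {au_id: scale factor in [0, 1]}
    """
    result = dict(rest_vertices)
    for au_id, w in intensities.items():
        for vid, (dx, dy, dz) in aus[au_id].items():
            x, y, z = result[vid]
            result[vid] = (x + w * dx, y + w * dy, z + w * dz)
    return result

# Hypothetical two-vertex face: AU12 (Lip Corner Puller) lifts a mouth
# corner; AU6 (Cheek Raiser) crinkles a point near the eye.
rest = {"mouth_corner": (1.0, 0.0, 0.0), "eye_corner": (1.2, 1.0, 0.0)}
aus = {
    12: {"mouth_corner": (0.1, 0.2, 0.0)},
    6:  {"eye_corner": (0.0, -0.1, 0.0)},
}
smile = apply_aus(rest, aus, {12: 1.0, 6: 0.5})
```

Scaling an AU's intensity down (here AU6 at 0.5) produces a weaker version of the same action, which is how a single coded event can cover a range of expression strengths.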
Animation Principles
Facial animation requires the artist to interpret and create something that speaks for the character's emotion or personality. Psychological research (Duchenne, 1990; Stein, 1992; Ekman, 1993) has shown that facial expression is one of the best tools for measuring emotion. Since character animation is about an artist bringing a character to life, besides body motion, facial animation is the best way to win the audience's acceptance of the characters. Though initially developed 70 years ago to improve animators' skill at drawing moving human figures and animals, Disney's twelve Animation Principles (Thomas & Johnston, 1981) still stand today as the best tool for examining character animation. They include: 1. Squash and Stretch; 2. Timing; 3. Anticipation; 4. Exaggeration; 5. Slow In and Out (or Ease In and Out); 6. Arcs; 7. Secondary Action; 8. Follow Through and Overlapping Action; 9. Staging; 10. Straight Ahead Action and Pose-To-Pose Action; 11. Solid Drawing; 12. Appeal. Among these principles, all but Staging and Solid Drawing are directly related to motion control of the characters. These principles, though, were developed and designed as a fundamental reference to nature for conventional hand-drawn 2D animation. Director of Toy
Story, the first ever 3D animated feature film, John Lasseter (Lasseter, 1987) demonstrated that Disney's Animation Principles can be successfully carried over into 3D computer animation. Based on these principles, Lasseter (Lasseter, 1994) further addressed another important guideline for animating characters, the Thinking Character. Unlike Disney's Animation Principles, this principle is more about motion design, especially facial animation. A natural motion has the character think before it acts, so that the character does not move like a marionette controlled by the animator. Thus, the best way to convey the idea that the character is thinking is
through the animation of facial expression. Compared to body motion, the motion scale of facial expression is much smaller. Nevertheless, more complex and subtle animation is required to achieve convincing character animation. The above principles provide not only an essential reference for body animation guidelines, but also an excellent index for evaluating facial animation. The following sections survey mainstream motion control techniques for facial animation. For each approach, the theory and mechanism are briefly described first. Then, its artistic expression control is surveyed using the above principles.
less realistic way. Textures enable complex variations of surface properties at each pixel, thus creating the appearance of facial detail that is absent from the face model's geometry. Consequently, textures are widely used to accomplish facial image realism. Oka et al. (Oka et al., 1987) used an interactive texture mapping system to simulate realistic facial expressions and animation. In their system, as the geometry of the free-form surfaces or the viewpoint changes, a new texture mapping, based on their proposed approximation scheme of mapping functions, is applied to achieve an optimal display. Since the mapping from the texture plane onto the output screen is approximated by a linear function on each of the small regions that together form the texture plane, their algorithm is relatively simple and efficient for smooth surfaces such as the human face. For facial animation, interpolation and extrapolation among multiple 3D facial surfaces are implemented, with dynamic texture mapping onto them depending on the geometry or viewpoint changes. Using multiple photographs, Pighin et al. (Pighin et al., 1998) developed techniques for creating realistic textured 3D facial models from photographs of real humans, and for creating smooth transitions between different facial expressions by morphing between these models (Figure 1). In their approach, a scattered data interpolation technique is used to deform a generic face model to fit the particular face geometry of the real actor. To generate transitions between facial expressions, morphing and blending between the corresponding face models and textures are implemented. Using these techniques, their system is able to generate photorealistic face models and natural-looking animations, on the condition that the lighting variation in the original texture photographs is carefully constrained.
Figure 1. Blending between surprised (left) and sad (center) produces a worried expression (right). (Pighin et al., 1998)
Figure 2. Top row: morphing between two real-person images (left, right) creates a lifelike in-between person (middle). Bottom row: the control lines drawn over the corresponding faces. (Beier & Neely, 1992)
To overcome the limitations of 2D morphing, Chen et al. (Chen et al., 1995) applied Beier and Neelys method to morph between cylindrical laser scans of human heads. Pighin et al. (Pighin et al., 1997) combine 2D morphing with 3D transformations of a geometric model. They animate key
facial expressions with 3D geometric interpolation, while image morphing is performed between corresponding texture maps. Still, even in these cases, animations are limited to interpolations
between two predefined expressions. The animator must specify correspondences for every pair of expressions in order to produce a smooth transition between them.
Anticipation is the preparation of an action; Follow Through and Overlapping Action is the termination of an action. Anticipation is extremely important in facial animation. Where the other animation guideline, the Thinking Character (the character must think before it acts), is concerned, Anticipation in a facial expression is the best way to exhibit this thinking characteristic. Though these two principles are critical in facial animation, they are also the principles that inexperienced animators most easily overlook. Unfortunately, it is difficult to satisfy these two principles with image manipulation approaches. A better solution would require the animator to carefully insert extra source images before and after the main action, so that they can be further manipulated to achieve artistic expression control of the facial animation. Timing and Slow In and Slow Out are the two principles that directly correspond to the tempo of an action. Timing refers to the time it takes to complete an action. For keyframing, this normally means the number of frames set between two key frames. Slow In and Slow Out describes the physical fact that objects in the real world do not abruptly start or stop moving. There is always a certain degree of acceleration at the beginning and deceleration at the end of a motion. Timing is often used to imply an object's size, weight, and even personality. Adding the Slow In and Slow Out property further ensures natural movement. For computer animation, with the aid of computing hardware and software, Timing is the most convenient tool available for animators to improve their work; they decide the number of in-between frames and let the computer do the rest. For facial animation using image manipulation, some form of interpolation has to be done to guarantee the transition between the manipulated images.
To preserve the Slow In and Slow Out property, cubic curve interpolation provides a good solution for ensuring smooth transitions between images. As both Timing and Slow In and Slow Out affect only the pace of an action, they can easily be adopted in an image manipulation system to provide better artistic control of facial animation. Solid Drawing refers to drawings with interesting, well-proportioned shapes and a good sense of weight and volume. Appeal is something in the scene that will please the audience and grab their attention. Both principles can be found in various forms, for example simplicity, clarity, pleasing design, a quality of charm or charisma, etc., and are the basics of 3D animation. If the images being manipulated are captured from a live performance, they uphold the very nature of facial expression; if they are manually produced, the animator has more luxury in creating them artistically. Either way, both Solid Drawing and Appeal can be better preserved. For artistic expression control of
facial animation, these two principles are the major advantage of using image manipulation over the other approaches.
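The cubic easing mentioned above can be sketched concretely. The sketch below uses the cubic Hermite "smoothstep" curve as one common choice (the function names and the frame count are illustrative, not from any cited system): it has zero velocity at both ends, which is exactly the Slow In and Slow Out profile for a cross-dissolve between two manipulated face images.

```python
# A sketch of Slow In / Slow Out blend weights for image morphing.

def ease_in_out(t):
    # Cubic Hermite "smoothstep": zero slope at t = 0 and t = 1,
    # so motion accelerates at the start and decelerates at the end.
    return t * t * (3.0 - 2.0 * t)

def morph_weight(frame, n_frames):
    # Blend weight for cross-dissolving between the source and target
    # images at a given frame of the transition (Timing = n_frames).
    return ease_in_out(frame / float(n_frames))
```

Changing `n_frames` adjusts Timing without touching the easing shape, so the two principles remain independently controllable.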
Interpolation
Interpolation is a process of generating a value based on its neighbors. Depending on the interpolation approach, neighboring values usually contribute a certain weight to the value being interpolated. This weight is often inversely proportional to the distance at which the neighbor is located. Therefore, interpolation can provide a user-desired smooth transition between neighbors. Interpolation techniques offer a simple and intuitive approach to computer animation. In facial animation, an interpolation function is typically used to specify a smooth transition between two facial expressions over a normalized time interval. Interpolation can be performed in one-dimensional, two-dimensional, or three-dimensional space. Interpolation approaches applied in computer animation generally fall into two categories: linear interpolation and cubic interpolation. For simplicity, linear interpolation is commonly used to generate facial animation. The pioneering work of Parke (Parke, 1972) introduced simple geometric interpolation between face models to generate animated sequences of a human face changing expression (Figure 3). It showed that facial animation can be achieved with a simple face model whose polygonal skin contains approximately 300 polygons defined by about 400 vertices, using a cosine interpolation scheme to fill in the intermediate frames between expressions. With bilinear interpolation, Arai et al. (Arai et al., 1996) proposed a method for generating facial
animation in which facial expression and shape can be changed simultaneously in real time. In their model, facial expressions are superimposed onto the face shape through a 2D parameter space that is independent of the face shape. Hence, the expressions can be applied to various facial shapes. To create a variety of facial expression changes, the facial model is transformed by bilinear interpolation, which enables a rapid change in facial expression with metamorphosis.
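The cosine interpolation scheme named above can be sketched as follows. The blending formula is the standard cosine ease over a normalized time interval; the two-keyframe vertex data in any use of it would come from the face model, and the function names here are illustrative.

```python
import math

# A sketch of cosine interpolation between two expression keyframes,
# in the spirit of Parke (1972): the same blend factor is applied to
# every vertex of two topologically identical face meshes.

def cosine_blend(t):
    """Cosine interpolation factor for normalized time t in [0, 1]."""
    return (1.0 - math.cos(math.pi * t)) / 2.0

def interpolate_expression(pose_a, pose_b, t):
    """Blend two lists of (x, y, z) vertex positions at time t."""
    w = cosine_blend(t)
    return [
        (ax + w * (bx - ax), ay + w * (by - ay), az + w * (bz - az))
        for (ax, ay, az), (bx, by, bz) in zip(pose_a, pose_b)
    ]
```

Unlike plain linear interpolation, the cosine factor starts and ends with zero velocity, so the in-between frames already exhibit a degree of Slow In and Slow Out.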
Parameterization
Compared to geometric interpolation, which directly moves the positions of the face mesh vertices, parameter interpolation controls functions that indirectly manipulate the vertices. Using these parametric functions, the animator can create facial images by specifying the appropriate set of facial parameters. The pioneering parametric models proposed by Parke (Parke, 1974) are based on the concepts of simple parameterization and image synthesis. Parameterization includes choosing the relevant facial parameters, based either on exterior observation or on the underlying structures that cause a specific expression. These parameters fall into two classes: 10 control parameters for facial expression and about 20 parameters for defining facial conformation (Parke, 1982). Among the expression parameters, the eyes are modeled by procedural animation to control the dilation of the pupils, the opening and closing of the eyelids, the position and shape of the eyebrows, and the viewing direction. The mouth is animated by rotation, which controls the position of the lips, the mouth corners, and the shape of the mouth. There are also additional parameters to control head rotation, the shape of the nostrils, etc. The conformation parameters are used to control the shape, color, proportions, and offsets of facial features. According to Parke, there are five different operations by which parameters determine facial geometry. Parameterization was applied to these expression and conformation
parameters to create facial animation. Further work by Parke and Waters continued to reduce the parameters, using what they call direct parameterization (Parke and Waters, 1996) to animate facial expressions. In their method, the face is still represented by polygonal meshes, but the animation can be simulated by a far smaller set of parameters, though the underlying animation approaches are still based on key-framing and interpolation. Compared to interpolation approaches, direct parameterization provides a more intuitive way to animate facial expressions. However, this method has its own problems. The set of parameters is not universal for all faces; that is, a set of parameters is bound to a certain facial topology. To create a different face, the animator has to reset the parameters. Furthermore, when there is a conflict between parameters, the resulting facial expression looks unnatural. To avoid this undesired effect, parameters are set to affect only specific facial regions; however, this often results in noticeable motion discontinuities, especially at region boundaries.
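The parameter-to-vertex idea can be illustrated with a toy sketch. Everything below is hypothetical (the functions, vertex layout, and constants are not Parke's actual operations); it only shows the contrast with interpolation: a handful of named parameters drive vertex positions through procedures, rather than blending stored keyframe meshes.

```python
# A toy sketch of direct parameterization: expression parameters are
# evaluated procedurally into vertex positions at animation time.

def eyelid_vertices(openness, lid_rest_y=1.0, lid_travel=0.2):
    """Place upper-eyelid vertices from a single 'openness' parameter.

    openness: 0.0 (fully closed) .. 1.0 (fully open)
    """
    y = lid_rest_y - (1.0 - openness) * lid_travel
    # A short row of lid vertices sharing the same height, for brevity.
    return [(x * 0.1, y, 0.0) for x in range(5)]

def mouth_corners(smile_amount):
    """One expression parameter moves both mouth corners symmetrically."""
    return [(-1.0 - 0.1 * smile_amount, 0.2 * smile_amount, 0.0),
            ( 1.0 + 0.1 * smile_amount, 0.2 * smile_amount, 0.0)]
```

Note how the symmetry is built into `mouth_corners`: this is convenient, but it also illustrates why a parameter set is bound to one facial topology, as the text observes.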
Mass-spring Systems
Mass-spring methods model the skin, muscles, and bones as a number of point masses connected by springs, like a cloth. These techniques produce muscle forces in an elastic spring mesh that simulates skin deformation. The early work of Platt and Badler (Platt and Badler, 1981) proposed a model based on the underlying facial structure. A set of interconnected three-dimensional networks of points is used to simulate points on the skin, muscles, and bones (Figure 4). The skin is the outermost layer, which can be viewed as a two-dimensional surface represented by points with three-dimensional coordinates. The bones represent an internal layer that cannot move. Between the two layers, muscles are groups of points with elastic arcs relating the bones to the skin. When a force or tension is applied to a portion of the point network, it propagates outward, affecting more and more distant sections of the face model and causing the skin to deform.
Figure 4. The layered point network of skin, muscle, and bone. (Platt & Badler, 1981)
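The core mechanics can be sketched as follows. This is a generic Hooke-spring sketch, not the Platt-Badler implementation: two skin points joined by one spring, with a simple damped Euler step, are enough to show how displacing one point creates a force that drags its neighbor on the next integration step.

```python
import math

# A minimal mass-spring sketch: spring force between two skin points
# (Hooke's law) plus one explicit Euler integration step per point.

def spring_force(p, q, rest_length, stiffness):
    """Force on point p exerted by the spring connecting p and q."""
    d = [qc - pc for pc, qc in zip(p, q)]
    dist = math.sqrt(sum(c * c for c in d))
    if dist == 0.0:
        return [0.0, 0.0, 0.0]
    # Positive magnitude pulls p toward q (stretched spring),
    # negative magnitude pushes p away (compressed spring).
    magnitude = stiffness * (dist - rest_length)
    return [magnitude * c / dist for c in d]

def euler_step(p, v, force, mass=1.0, dt=0.01, damping=0.98):
    """Advance one point's position and velocity by one time step."""
    v = [damping * (vc + dt * fc / mass) for vc, fc in zip(v, force)]
    p = [pc + dt * vc for pc, vc in zip(p, v)]
    return p, v
```

Repeating this force computation over every spring in the mesh, step after step, is what makes a local pull "propagate outward" through increasingly distant sections of the network.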
Vector Representations
Vector techniques deform a facial mesh using motion fields in defined regions of influence. The model proposed by Waters (Waters, 1987) is particularly noteworthy. In his model, muscles are represented by muscle vectors, which describe the effect of muscle contraction on the topology of the face (Figure 5). Following the generic anatomical structure of the human face, most facial muscles have a bony attachment that remains static, while the other end is embedded in the soft tissue of the skin. When the muscle operates, it contracts isotonically. Thus two types of vectors are created: linear/parallel muscles that contract longitudinally towards their origin, and sphincter muscles that squeeze radially towards a center point (Figure 6). An additional type, the sheet muscle, can be modeled by composing several linear muscles side by side. Waters's model does not depend on the bone structure, which enables it to be ported to diverse facial topologies. Using this muscle model combined with parameterized techniques, Waters was able to animate basic human emotions such as anger, fear, surprise, disgust, joy, and happiness. Today, most physically based models are still built on Waters's basic principles.
Figure 5. An implementation of Waters' muscle model showing a smiling expression. The black line segments in the left figure indicate the directions of muscle contraction.
Figure 6. Waters's muscle model. (left) linear muscle; (right) sphincter muscle.
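A Waters-style linear muscle can be sketched as below. This is a simplified 2D variant for illustration, not Waters's exact formulation: the cosine falloff and the radial attenuation are representative choices, and the influence cone and contraction values are hypothetical.

```python
import math

# A simplified sketch of a linear muscle vector: vertices inside a cone
# of influence are pulled toward the muscle's static (bony) attachment,
# with the pull falling off with angular and radial distance.

def linear_muscle_pull(vertex, attachment, insertion, contraction,
                       influence_angle=math.pi / 4):
    """Return the displaced 2D vertex under one linear muscle."""
    vx, vy = vertex[0] - attachment[0], vertex[1] - attachment[1]
    mx, my = insertion[0] - attachment[0], insertion[1] - attachment[1]
    r = math.hypot(vx, vy)
    m_len = math.hypot(mx, my)
    if r == 0.0 or r > m_len:
        return vertex                      # outside radial influence
    cos_a = (vx * mx + vy * my) / (r * m_len)
    angle = math.acos(max(-1.0, min(1.0, cos_a)))
    if angle > influence_angle:
        return vertex                      # outside angular influence
    # Strongest along the muscle axis, zero at the cone boundary.
    falloff = math.cos(angle / influence_angle * math.pi / 2)
    k = contraction * falloff * (1.0 - r / m_len)
    return (vertex[0] - k * vx, vertex[1] - k * vy)
```

Because the deformation depends only on the muscle vector and the vertex positions, not on any bone structure, the same routine can be attached to a different facial mesh unchanged, which is the portability the text attributes to Waters's model.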
Compared to the prior mass-spring models, layered spring mesh models offer more realistic results at the expense of extensive computation. Wu et al. (Wu et al., 1990) proposed a simplified mesh system to reduce computation time while still maintaining visual realism. The skin surface is represented as a two-dimensional lattice of three-dimensional points. The connective fatty tissue between the skin and muscle is simulated as spring-force constraints acting between points. Through an elastic process, the simulated skin surface deforms with muscle contraction, producing expressive wrinkles in facial animation and skin aging.
Rational Free Form Deformation (RFFD) adds an extra degree of freedom in specifying deformations by incorporating a weight factor for each control point. Therefore, deformations can be achieved by changing the weight factors instead of changing the positions of the control points. When all weight factors in an RFFD are equal, it essentially becomes an FFD. Although facial expressions may be rendered through muscle activities, due to the complexity of the anatomical structure of the human face, the direct use of a muscle-based model is not a trivial task. Kalra et al. (Kalra et al., 1991) proposed a multi-layered framework that uses abstract entities to create facial animation. Each level has its own abstract entities, input, and output, so facial expressions can easily be edited without impacting the other layers. Rational Free Form Deformation (Kalra et al., 1992) is implemented to simulate the abstract muscle actions. For a particular muscle action on the face, surface regions corresponding to the anatomical description of the muscle action are defined. A parallelepiped control volume is then defined on the region of interest. The skin deformations in the volume are simulated by interactively displacing and weighting the control points. One or several simulated muscle actions can be formed into a Minimum Perceptible Action (MPA). These MPAs work as atomic action units, similar to the Action Units (AUs) of the Facial Action Coding System (FACS, as discussed in the next section), to build an expression. Compared to physically based models, manipulating the weights or positions of the control points is simpler and more intuitive than manipulating muscle vectors with delineated regions of influence. However, FFD (EFFD, RFFD) does not provide detailed control of the actual muscle and skin behavior, so it fails to model bulges in the muscle and wrinkles in the skin. This certainly limits its use for facial simulation in realistic character animation.
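The effect of the rational weights can be sketched on a one-dimensional slice of the deformation volume. The sketch below uses a rational Bezier curve as a stand-in for one axis of an RFFD lattice (the control points and weights are hypothetical); it shows the key property the text describes: raising a control point's weight pulls the deformed surface toward that point without moving the point itself.

```python
from math import comb

# A toy sketch of the rational FFD idea, reduced to one dimension.

def bernstein(n, i, t):
    """Bernstein basis polynomial B(i, n) evaluated at t."""
    return comb(n, i) * t ** i * (1.0 - t) ** (n - i)

def rffd_point(control_points, weights, t):
    """Evaluate a rational Bezier curve (a 1D slice of an RFFD volume)."""
    n = len(control_points) - 1
    num_x = num_y = den = 0.0
    for i, ((px, py), w) in enumerate(zip(control_points, weights)):
        b = w * bernstein(n, i, t)       # weighted basis contribution
        num_x += b * px
        num_y += b * py
        den += b
    return (num_x / den, num_y / den)
```

With all weights equal, the denominator normalizes away and the curve reduces to an ordinary Bezier, mirroring the statement that an RFFD with uniform weights is simply an FFD.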
FACS is a description of the movements of the facial muscles derived from the analysis of facial anatomy (Figure 8).
FACS analyzes a facial expression by decomposing it into the specific action units (AUs) that produce the movement. The original FACS includes 44 basic action units. The revised version of 2002 (http://face-and-emotion.com/dataface/facs/new_version.jsp) increases the number of action units to 64. It also reassigns the old AU41, AU42, and AU44 scores to intensities of AU43, and revises the scoring of AU25, AU26, and AU27 with intensities. Sample single action units are listed in Table 1.
Table 1. Sample single action units.

AU   FACS Name             Muscle
1    Inner Brow Raiser     Frontalis, pars medialis
2    Outer Brow Raiser     Frontalis, pars lateralis
4    Brow Lowerer          Corrugator supercilii, Depressor supercilii
5    Upper Lid Raiser      Levator palpebrae superioris
6    Cheek Raiser          Orbicularis oculi, pars orbitalis
7    Lid Tightener         Orbicularis oculi, pars palpebralis
9    Nose Wrinkler         Levator labii superioris alaquae nasi
10   Upper Lip Raiser      Levator labii superioris
12   Lip Corner Puller     Zygomaticus major
15   Lip Corner Depressor  Depressor anguli oris
17   Chin Raiser           Mentalis
20   Lip Stretcher         Risorius w/ platysma
23   Lip Tightener         Orbicularis oris
24   Lip Pressor           Orbicularis oris
25   Lips Part             Depressor labii inferioris, or relaxation of Mentalis
26   Jaw Drop              Masseter; relaxed Temporalis and internal Pterygoid
27   Mouth Stretch         Pterygoids, Digastric
Facial expressions can be created by combining action units. For example, combining AU1 (Inner Brow Raiser), AU6 (Cheek Raiser), AU12 (Lip Corner Puller), and AU14 (Dimpler) generates a happy expression. Table 2 lists the sets of action units for the six basic facial expressions.
Table 2. Action units involved in the six basic facial expressions.

Expression   Involved AUs
Anger        AU2, 4, 7, 9, 10, 20, 26
Disgust      AU2, 4, 9, 15, 17
Fear         AU1, 2, 4, 5, 15, 20, 26
Happiness    AU1, 6, 12, 14
Sadness      AU1, 4, 15, 23
Surprise     AU1, 2, 5, 15, 16, 20, 26
FACS uses action units, instead of muscles, as the measurement unit of facial behavior. As Tables 1 and 2 show, an expression may combine several action units to form an appearance, and each action unit may involve one or more muscles, so muscle activity still represents the fundamental entity that animation control can work on. From the perspective of motion control, geometric approaches, whether physical muscle models that describe the properties and behavior of human skin, bone, and muscle, or pseudo-muscle models that mimic the dynamics of muscle tissue with heuristic geometric deformations, provide good solutions for facial animation when paired with the Facial Action Coding System, especially when geometric deformation of the face is a required criterion.
This principle is undoubtedly the most important principle in facial animation. It usually emphasizes the shape deformation caused by the underlying skeletal movement, such as the muscle bulge that expresses jaw movement. The degree of deformation depicts the strength of either external pressure or the character's own power. Since geometric manipulation methods try to simulate the muscle activities underlying the facial skin, shape deformation can be used judiciously to create persuasive animation. Naturally, Squash and Stretch can be implemented simply with these techniques. Compared to the other techniques, geometric manipulation provides the best tool for simulating Squash and Stretch, whether the model is physically based or pseudo-muscle based, and it is up to the animator to set the proper parameters to fine-tune the facial animation and achieve the desired Squash and Stretch. Timing refers to the tempo of an action. While in body animation this principle is often used to imply a character's physical weight and size, in facial animation it is commonly applied to show a character's mental state. The best way to implement Timing in geometric
manipulation is through FACS, leaving the computation to the underlying simulation mechanism. By controlling the duration of each action unit in a facial expression, the facial animation can be laid out in both space and time. Slow In and Slow Out also relates to the tempo of an action. By slowing the pace at the beginning and end of an action, a natural movement is ensured. Though the Slow In and Slow Out property in facial animation is not as visually obvious as in body animation, its importance cannot be overlooked, as any abrupt movement at the start or end of a facial action may lead viewers to a wrong interpretation of the facial expression. For interpolation and parameterization approaches, Slow In and Slow Out can easily be implemented through cubic interpolation. However, for muscle-based approaches, extra effort from animators is required to preserve this motion property. Both Timing and Slow In and Slow Out affect only the pace of an action. Nevertheless, they are indispensable factors for geometric manipulation systems to provide better artistic control of facial animation. The main action of a facial animation usually carries the major Squash and Stretch. Anticipation, which occurs before the main action, and Follow Through and Overlapping Action, which follows it, are the two principles animators use to convey the clarity and drama of the main action. Since Squash and Stretch in facial animation seldom involves movement as fast as the main actions of body movement, the scale of Anticipation and Follow Through and Overlapping Action is usually small in magnitude.
Geometric manipulation, when coupled with FACS, provides animators an intuitive tool for producing these two criteria. With careful settings in the time domain, both principles can be subtly displayed. However, special attention and effort may be needed when applying Follow through and Overlapping Action, as some parameters may conflict with each other, or with themselves, during the overlapping period of actions.
Arcs describes the trajectory of natural body motion, and Secondary Action refers to the subsidiary motions caused by the main action. Both principles are used to enhance the main action, and both require the animator to define them explicitly in the animation. Owing to the geometric structure of the face model and the kinematic constraints on each facial attribute, the Arcs characteristic of skin-surface features arises naturally as a resultant motion in facial animation. Secondary Actions, on the other hand, usually need to be added intentionally by the animator, especially in methods using interpolation or parameterization.
Exaggeration makes a movement seem more dramatic. It is widely used in entertainment to capture the audience's interest, and it is the most important animation principle in facial animation other than Squash and stretch. Since geometric manipulation holds the geometric data in full detail, it gives the animator ultimate control over the facial animation. Exaggeration can be achieved in several ways: exaggerating the keys in interpolation methods; amplifying the animation parameters in parameterization methods; scaling the physical or vector-based structures in muscle methods; or even working in a different domain, such as the time duration of action units in a FACS system. Compared to the other techniques, geometric manipulation provides an intuitive and effective way to create exaggeration. However, the animator should be careful not to over-exaggerate, as that tends to make facial expressions look bizarre.
Appeal refers to making animation that viewers want to see; much of it comes from the proper use of the other animation principles. Appeal can be achieved through the design, simplicity, and behavior of characters. In general, geometric manipulation techniques provide animators an unsurpassed tool for creating appeal in a scene. However, because digital computers duplicate data so readily, symmetry often appears in character modeling and animation, which reduces the appeal of the scene. This symmetry problem is particularly apparent in geometric manipulation approaches: for example, symmetry of geometry between the vertical halves of the face model, and symmetry of feature movements caused by animation control mechanisms such as interpolation, parameterization, and muscle-based approaches.
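Two of these ideas can be sketched in a few lines: exaggeration as a gain applied to each vertex's offset from the neutral pose, and a small random perturbation that breaks the perfect left-right symmetry digital duplication produces. The array layout and parameter names here are assumptions for illustration, not part of any published system:

```python
import numpy as np

def exaggerate(neutral: np.ndarray, expression: np.ndarray, gain: float) -> np.ndarray:
    """Scale every vertex's displacement from the neutral pose by `gain`;
    gain > 1 exaggerates the expression, gain < 1 subdues it."""
    return neutral + gain * (expression - neutral)

def break_symmetry(expression: np.ndarray, neutral: np.ndarray,
                   jitter: float = 0.05, seed: int = 0) -> np.ndarray:
    """Randomly perturb the vertex displacements so the two halves of the
    face no longer move identically."""
    rng = np.random.default_rng(seed)
    offsets = expression - neutral
    noise = rng.uniform(1.0 - jitter, 1.0 + jitter, size=offsets.shape)
    return neutral + offsets * noise
```

The same gain idea applies in any of the domains listed above: keys, animation parameters, muscle strengths, or action-unit durations.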
Performance-driven Techniques
The difficulty of creating complex facial expressions has led to performance-driven approaches, in which the actions of a human face are recorded and used to control the facial animation. Advantages of performance-driven techniques include significant savings of time and labor over manually crafted animation, and the potential to produce more realistic facial animation. Many approaches to performance-driven facial animation exist. Most track facial markers on a real actor, recover the 2D or 3D positions of these markers, and filter or transform them to generate motion data. These data are then used either to generate facial expressions directly or to infer FACS action units when creating the facial animation.
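A minimal sketch of the mapping step, assuming tracked 2D marker positions and a set of per-parameter marker-displacement modes (e.g. one per FACS-like action unit), is a least-squares fit of the observed displacements to those modes:

```python
import numpy as np

def fit_expression_weights(rest: np.ndarray, tracked: np.ndarray,
                           modes: np.ndarray) -> np.ndarray:
    """Solve, in the least-squares sense, for weights w such that
    tracked - rest is approximately sum_k w[k] * modes[k].
    rest, tracked: (n_markers, 2); modes: (n_modes, n_markers, 2)."""
    d = (tracked - rest).ravel()              # observed marker displacements
    B = modes.reshape(modes.shape[0], -1).T   # one column per mode
    w, *_ = np.linalg.lstsq(B, d, rcond=None)
    return w
```

The recovered weights can then drive the deformation model directly, or be interpreted as FACS action-unit activations, as the surveyed systems do.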
wire-frame face model, and reproduce facial expression. The major weakness of this system is that it requires facial features to be highlighted with make-up for successful tracking. Although active contour models are used, the facial structure is still passively shaped by the tracked contour features, without any active control. More recently, Lee et al. (2000) used a structured snake method to fit curves to features and extract the face feature shapes. However, the animator has to locate each terminal point of the feature sets manually, so real-time processing is impossible. Various approaches (Patterson et al., 1991; Moubaraki et al., 1995; Ohya et al., 1995; Bascle and Blake, 1998) have adopted computer vision for performance-driven facial animation. The simplest use colored markers painted on the actor's face or lips to simplify and aid the tracking of facial expressions or the recognition of speech from video sequences. However, markings on the face are somewhat intrusive and often impractical. To obviate the need for intentional markings on the face, approaches using optical flow (DeCarlo and Metaxas, 2000) and spatio-temporal normalized correlation measurements (Darrell and Pentland, 1993) offer another solution: natural feature tracking.
Unlike Williams's approach (Williams, 1990), which used a single static texture image of a real person's face and tracked points only in 2D, Guenter et al. (1998) use a large set of sampling points on the face to track its 3D deformations. In addition to tracking the geometric data, multiple registered video images of the face are captured to create a texture-map sequence for the polygonal face model. The resulting facial animation looks quite life-like. Moreover, with true 3D face geometry and texture information, it is better suited to 3D virtual-environment applications than conventional video. However, their system makes little use of existing data when animating a new model: each time a new model is created, method-specific tuning is unavoidable. Motivated by techniques for retargeting full-body animations from one character to another (Gleicher, 1998), Noh and Neumann's expression-cloning work (Noh and Neumann, 2001) directly maps an expression of the source model onto the surface of the target model while preserving the relative motions and dynamics of the original facial animation. Instead of creating new facial animations from scratch for each new model, their method takes advantage of existing animation data in the form of vertex motion vectors: it transfers vertex motion vectors from the original face model to a target model with different geometric proportions and mesh structure. Dense correspondences between the models are computed by using a small set of initial correspondences to establish an approximate relationship; identifying those initial correspondences, however, requires manual selection of surface points on the face model. 3D shape-registration algorithms typically match the 3D data of a model shape in one coordinate system onto another shape model in a different geometric representation.
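In schematic form, and greatly simplified relative to the published expression-cloning method, the vertex motion-vector transfer amounts to a gather through precomputed correspondences with a per-vertex scale that compensates for differing local proportions; the scalar scale used here is an illustrative stand-in for the full local transformation:

```python
import numpy as np

def transfer_motion(source_motion: np.ndarray, correspondence: np.ndarray,
                    scale: np.ndarray) -> np.ndarray:
    """For each target vertex i, copy the motion vector of its matched
    source vertex and rescale it for the target's local proportions.
    correspondence[i] indexes the source vertex; scale[i] is a scalar."""
    return source_motion[correspondence] * scale[:, None]
```

With the dense correspondence in hand, an entire animation sequence transfers frame by frame without re-authoring any keys on the target model.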
Based on the iterative closest point (ICP) algorithm, which finds the closest point on a geometric entity to a given point, Besl and McKay (1992) proposed a method to register digitized data from rigid objects against an idealized geometric model. Similar work, with lower computational cost, was proposed by Zhang (1994) to establish geometric matching between different model geometries. Performance-driven facial animation has been approached with a variety of tracking techniques and deformable facial models; commercial packages are even available with automated dot tracking. However, existing methods still lack a proper model of the motion style within an expression, which results from the combined effect of muscle action and secondary skin-surface details such as wrinkles.
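One iteration of ICP, sketched here in brute force rather than the accelerated form of the published algorithm, pairs each source point with its nearest destination point and then solves the optimal rigid alignment in closed form via SVD (the Kabsch solution):

```python
import numpy as np

def icp_step(src: np.ndarray, dst: np.ndarray):
    """One ICP iteration on 2D or 3D point sets: nearest-neighbour
    correspondences, then the closed-form best rigid rotation/translation."""
    # Brute-force nearest-neighbour matching (for clarity only).
    d2 = ((src[:, None, :] - dst[None, :, :]) ** 2).sum(-1)
    matched = dst[d2.argmin(axis=1)]
    # Closed-form rigid alignment of src onto matched.
    cs, cm = src.mean(0), matched.mean(0)
    H = (src - cs).T @ (matched - cm)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:       # guard against a reflection solution
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = cm - R @ cs
    return src @ R.T + t, R, t
```

Repeating this step until the alignment error stops decreasing is the full algorithm; the digitized face data and the idealized model play the roles of `src` and `dst`.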
the appeal of character animation. All the animation principles concerned with motion control, other than Exaggeration, can be implemented easily through performance-driven approaches. Since the animation principles describe the natural movement of an object (a character), they mainly guide animators in animating every detail in the scene. In computer animation, unfortunately, this implies that if the animator does not plan for them, they will not exist in the scene; it is therefore the animator's responsibility to implement these principles in order to create quality animation. Performance-driven approaches, by contrast, record the movements of a real actor and transplant them onto a different character. Motion details are thus inherited from the performer, and real motions naturally lead to realistic animation.
Conclusions
Summary
Character animation comprises facial animation and body animation, of which facial animation is the more appealing, since a character's feelings and emotions are usually expressed through the face. To date, many published works have provided different approaches for various facial-expression applications, but few have addressed the issues of the animator's artistic design and motion control of facial animation. In this paper, we describe and survey the techniques associated with facial animation. The approaches are organized into categories that reflect the similarities between methods; the three major themes are image manipulation, geometry manipulation, and performance-driven methods. From the perspective of artistic expression control, the Animation Principles serve as the yardstick in surveying the variations among these approaches. In general, the generation of 3D facial animation can be summarized as follows: after a specific model is obtained, the constructed individual facial model is deformed to produce facial expressions based on a simulating mechanism; the complete facial animation is then performed through the animator's manual operations, through the Facial Action Coding System, or by tracking a real actor's facial actions.
References
Arai K., Kurihara T., & Anjyo K., (1996). Bilinear Interpolation for Facial Expression and Metamorphosis in Real-Time Animation. The Visual Computer (12), 105-116.
Bascle B. & Blake A., (1998). Separability of pose and expression in facial tracking and animation. In Proc. of ICCV, 323-328.
Baron-Cohen S., Wheelwright S., Hill J., Raste Y., & Plumb I., (2001). The "Reading the Mind in the Eyes" Test Revised Version: A Study with Normal Adults, and Adults with Asperger Syndrome or High-functioning Autism. Journal of Child Psychology and Psychiatry, 42(2), 241-251.
Basu S., Oliver N., & Pentland A., (1998). 3D Modeling and Tracking of Human Lip Motions. Proceedings of International Conference on Computer Vision, 337-343.
Beier T. & Neely S., (1992). Feature-based image metamorphosis. ACM SIGGRAPH Computer Graphics (26), 35-42.
Besl P. & McKay N., (1992). A method for registration of 3-D shapes. IEEE Transactions on Pattern Analysis and Machine Intelligence, 14(2), 239-256.
Blinn J. F. & Newell M. E., (1976). Texture and reflection in computer generated images. Communications of the ACM, Vol. 19(10), 542-547.
Bronstein A. M., Bronstein M. M., & Kimmel R., (2007). Calculus of non-rigid surfaces for geometry and texture manipulation. IEEE Trans. Visualization and Computer Graphics, 13(5), 902-913.
Buenaposada J. M. & Munoz E., (2006). Performance driven facial animation using illumination independent appearance-based tracking. Proceedings of the 18th International Conference on Pattern Recognition, Vol. 1, 303-306.
Choe B., Lee H., & Ko H., (2001). Performance-Driven Muscle-Based Facial Animation. Journal of Visualization and Computer Animation, 12(2), 67-79.
Cohen M. & Massaro D., (1993). Modeling co-articulation in synthetic visual speech. In N. Magnenat-Thalmann & D. Thalmann (Eds.), Models and Techniques in Computer Animation, 139-156. Springer-Verlag, Tokyo.
Coquillart S., (1990). Extended Free-Form Deformation: A Sculpturing Tool for 3D Geometric Modeling. ACM SIGGRAPH Computer Graphics (24), 187-193.
Darrell T. & Pentland A., (1993). Spacetime gestures. In Computer Vision and Pattern Recognition.
Davis J., Ramamoorthi R., & Rusinkiewicz S., (2003). Spacetime stereo: A unifying framework for depth from triangulation. In CVPR'03, 359-366.
DeCarlo D. & Metaxas D., (2000). Optical flow constraints on deformable models with applications to face tracking. International Journal of Computer Vision, 38(2), 99-127.
Duchenne de Boulogne, B., (1990). The Mechanism of Human Facial Expression or an Electro-physiological Analysis of the Expression of the Emotions. Cambridge University Press, New York, NY.
Ekman P. & Friesen W. V., (1978). Facial Action Coding System: A Technique for the Measurement of Facial Movement. Consulting Psychologists Press, Palo Alto, CA.
Ekman P., (1993). Facial Expression and Emotion. American Psychologist, 48(4), 384-392.
Essa I., Basu S., Darrell T., & Pentland A., (1996). Modeling, Tracking and Interactive Animation of Faces and Heads using Input from Video. Proceedings of Computer Animation Conference, 68-79. Geneva, Switzerland, IEEE Computer Society Press.
Gleicher M., (1998). Retargetting Motion to New Characters. ACM SIGGRAPH Computer Graphics (32), 33-42.
Guenter B., Grimm C., Wood D., Malvar H., & Pighin F., (1998). Making Faces. ACM SIGGRAPH Computer Graphics (32), 55-66.
Huang P., Zhang C., & Chiang F., (2003). High-speed 3-D shape measurement based on digital fringe projection. Opt. Eng., 42(1), 163-168.
Kalra P., Mangili A., Magnenat-Thalmann N., & Thalmann D., (1991). SMILE: A Multi-layered Facial Animation System. Proc. IFIP WG 5, 189-198, Tokyo, Japan.
Kalra P., Mangili A., Magnenat-Thalmann N., & Thalmann D., (1992). Simulation of Facial Muscle Actions Based on Rational Free Form Deformations. Eurographics, Vol. 11(3), 59-69.
Kass M., Witkin A., & Terzopoulos D., (1987). Snakes: Active Contour Models. International Journal of Computer Vision, Vol. 1(4), 321-331.
Koch R. M., Gross M. H., Carls F. R., von Büren D. F., Fankhauser G., & Parish Y. I. H., (1996). Simulating facial surgery using finite element models. Proceedings of the 23rd Annual Conference on Computer Graphics and Interactive Techniques, 421-428.
Lasseter J., (1987). Principles of Traditional Animation Applied to 3D Computer Animation. ACM SIGGRAPH Computer Graphics (21), 77-88.
Lasseter J., (1994). Tricks to Animating Characters with a Computer. Course Notes #1, Animation Tricks, ACM SIGGRAPH 94.
Lee Y. C., Terzopoulos D., & Waters K., (1995). Realistic face modeling for animation. ACM SIGGRAPH Computer Graphics (29), 55-62.
Lee W., Kalra P., & Magnenat-Thalmann N., (1997). Model based face reconstruction for animation. In Proceedings of MMM'97, World Scientific Press, Singapore, 323-338.
Moubaraki L., Ohya J., & Kishino F., (1995). Realistic 3D Facial Animation in Virtual Space Teleconferencing. 4th IEEE International Workshop on Robot and Human Communication, 253-258.
Noh J. & Neumann U., (2001). Expression Cloning. ACM SIGGRAPH Computer Graphics (35), 277-288.
Ohya J., Kitamura Y., Takemura H., Ishii H., Kishino F., & Terashima N., (1995). Virtual Space Teleconferencing: Real-Time Reproduction of 3D Human Images. Journal of Visual Communications and Image Representation, Vol. 6, 1-25.
Oka M., Tsutsui K., Ohba A., Kurauchi Y., & Tago T., (1987). Real-time manipulation of texture-mapped surfaces. ACM SIGGRAPH Computer Graphics (21), 181-188.
Parke F. I., (1972). Computer Generated Animation of Faces. Proceedings of the ACM Annual Conference, 451-457.
Parke F. I., (1974). A parametric model for human faces. PhD thesis, University of Utah, Salt Lake City, UT. Tech. Report UTEC-CSc-75-047.
Parke F. I., (1982). Parameterized models for facial animation. IEEE Computer Graphics and Applications, Vol. 2(9), 61-68.
Parke F. I., (1989). Parameterized models for facial animation revisited. In ACM SIGGRAPH Facial Animation Tutorial Notes, 53-56.
Parke F. I. & Waters K., (1996). Computer Facial Animation. AK Peters, Wellesley, MA. ISBN 1-56881-014-8.
Patterson E., Litwinowicz P., & Greene N., (1991). Facial Animation by Spatial Mapping. Proc. Computer Animation, Springer-Verlag, 31-44.
Pieper S., Rosen J., & Zeltzer D., (1992). Interactive graphics for plastic surgery: A task-level analysis and implementation. Computer Graphics, Special Issue: 1992 Symposium on Interactive 3D Graphics, 127-134.
Pighin F., Auslander J., Lischinski D., Salesin D. H., & Szeliski R., (1997). Realistic Facial Animation Using Image-Based 3D Morphing. Technical Report UW-CSE-97-01-03.
Pighin F., Hecker J., Lischinski D., Szeliski R., & Salesin D. H., (1998). Synthesizing Realistic Facial Expressions from Photographs. ACM SIGGRAPH Computer Graphics (32), 75-84.
Pighin F., Szeliski R., & Salesin D. H., (1999). Resynthesizing facial animation through 3D model-based tracking. Proceedings of International Conference on Computer Vision, 143-150.
Platt S. & Badler N., (1981). Animating facial expression. ACM SIGGRAPH Computer Graphics (15), 245-252.
Rusinkiewicz S., Hall-Holt O., & Levoy M., (2002). Real-time 3D model acquisition. ACM SIGGRAPH Computer Graphics (36), 438-446.
Sederberg T. W. & Parry S. R., (1986). Free-form deformation of solid geometric models. ACM SIGGRAPH Computer Graphics (20), 151-160.
Scheepers F., Parent R. E., Carlson W. E., & May S. F., (1997). Anatomy-based modeling of the human musculature. ACM SIGGRAPH Computer Graphics (31), 163-172.
Sifakis E., Selle A., Robinson-Mosher A., & Fedkiw R., (2006). Simulating speech with a physics-based facial muscle model. Proceedings of the 2006 ACM SIGGRAPH/Eurographics Symposium on Computer Animation, 261-270.
Stein N. L. & Oatley K., (1992). Basic emotions: Theory and measurement. Cognition and Emotion, 6, 161-168.
Terzopoulos D. & Waters K., (1990). Physically-based facial modeling, analysis, and animation. Journal of Visualization and Computer Animation, Vol. 1(4), 73-80.
Terzopoulos D. & Waters K., (1993). Analysis and synthesis of facial image sequences using physical and anatomical models. IEEE Trans. Pattern Analysis and Machine Intelligence, 15(6), 569-579.
Thalmann N., Cazedevals A., & Thalmann D., (1993). Modeling Facial Communication Between an Animator and a Synthetic Actor in Real Time. Proc. Modeling in Computer Graphics, 387-396, Genova.
Thomas F. & Johnston O., (1981). The Illusion of Life. Abbeville Press, New York.
Wang Y., Huang X., Lee C., Zhang S., Samaras D., Metaxas D., Elgammal A., & Huang P., (2004). High resolution acquisition, learning and transfer of dynamic 3-D facial expressions. Computer Graphics Forum (23), 677-686.
Waters K., (1987). A muscle model for animating three-dimensional facial expression. ACM SIGGRAPH Computer Graphics (21), 17-24.
Williams L., (1990). Performance-driven facial animation. ACM SIGGRAPH Computer Graphics (24), 235-242.
Yanagisawa H., Maejima A., Yotsukura T., & Morishima S., (2005). Quantitative representation of face expression using motion capture system. In ACM SIGGRAPH 2005 Posters.
Zhang Z., (1994). Iterative point matching for registration of free-form curves and surfaces. International Journal of Computer Vision, 13(2), 119-152.
Zhang L., Curless B., & Seitz S., (2003). Spacetime stereo: Shape recovery for dynamic scenes. In CVPR'03, 367-374.