
Augvox: an augmented voice instrument

Brian Tuohy
Sonic Arts Research Centre, Queen's University Belfast, BT7 1NN, UK
btuohy02@qub.ac.uk

ABSTRACT
As a musical instrument, the human voice offers a significant range of sonic results. Control of this particular instrument is aided by a natural familiarity, and it takes little concentration to quickly produce sounds of vastly differing musical character. The advent of computer music has afforded manipulative processing that delivers auditory results that would never be attainable in the natural world. To this end, we now have the capability to process the human voice in order to facilitate impossible dynamic ranges, stretch notes far longer than would be naturally conceivable, or create abstract sounds extrapolated far beyond the original vocal material. This project attempts to bridge the gap between the natural generation of vocal sounds and complex digital processing by developing a simple, ergonomic instrument that places control of the sonic output in the hands of the musician.

2.1 Design
In order to accommodate the integration of several electronic components, the instrument housing had to be purpose-built. This also allowed much greater control over the ergonomic and aesthetic properties of the instrument, something that was of great concern in the design process. To encourage the most straightforward interactions possible, the instrument needed to be designed so that the controls would sit seamlessly under the musician's fingers and would not require excessive cognitive load to operate. For this reason, the sensors sit under each of the fingers of the left hand, exactly where they rest naturally. The instrument takes the form of a dynamic microphone with four force sensitive resistors (FSRs) set along the front of the body. Each of these is placed so as to be acted upon by one of the four fingers of the left hand. The thumb controls the orientation of a joystick sunken into the side of the body.

Keywords
Augmented Microphone, Interaction Design, Vocal Manipulation, Spectral Processing

1. INTRODUCTION
This paper provides an overview of the Augvox project, an interaction design project that explores the development of an augmented microphone instrument, in order to facilitate extended vocal capabilities and introduce the voice as an input to a physical instrument for the creation of computer music. Throughout the following sections, a background to the project will be provided, followed by a report on the development stages, including design sketches and photographs of early physical prototypes. This will be followed by a section explaining the interactions the instrument affords and the sonic processes that help shape its sound. Finally, a discussion of the issues encountered throughout the project will be provided, followed by conclusions and intentions for further work.

Figure 1. Conceptual sketch of initial instrument design

Figure 1 shows the joystick and four FSRs in their initial design location. Designing the correct placement of these sensors was essential in attempting to properly capture the squeezing gesture that is associated with putting significant force on the FSRs. For the same reason, it was important to correctly gauge the optimal body size to allow for the highest range of force to be exerted on the sensors. These design questions were approached with the first physical prototypes, as shown in Figure 2 and Figure 3.

2. BACKGROUND
The design of this instrument was built upon that of a dynamic microphone. While the idea was to create a familiar interface with additional control mechanisms, the intention was not simply to provide an augmented interface for an existing piece of equipment. In order for this instrument to achieve its proposed goals, it must be considered as an instrument unto itself, not simply a microphone with additional functionality. In this regard, the instrument could be considered as a kind of modular system, consisting of the input source, interface, sound engine, and resulting sound output. The input, here, is the vocal product. The sounds that the musician creates pass through the rest of the system and are altered by the musician's interactions with the instrument, in much the same way that a trumpet player's breath is passed through the valves of their instrument and acted upon by the fingers pressing or releasing the valves.

Figure 2. First prototype, showing approximate size to fit hand

Figure 2 shows the first physical prototype, which plots the position of each of the fingers while gripping, in order to provide the greatest degree of control when acting upon the FSRs and joystick. Figure 3 integrates the proposed sensors in an attempt to further visualise the intended interactions with the instrument.

Figure 4. Early design sketch, showing instrument design and location of sensors

2.2 Design guidelines


Several established design guidelines were consulted throughout the planning and development stages for this instrument:

Affordances [6] - The design must make it obvious how the musician should use the instrument, and the intended function should feel comfortable and natural.

Visibility of system status [5] - It is important for the musician to know that what they are doing is having some effect and that the system is working, particularly for non-tactile sensors such as the rangefinder.

"Copying an instrument is dumb, leveraging expert technique is smart" / "Some players have spare bandwidth, some do not" [2] - Building on techniques with which the musician is already familiar may make the instrument easier to learn and lead to more pleasing results. Using an input source that requires low cognitive strain allows for more concentration on other tasks.

"A visual component is essential to the audience, such that there is a visual display of input parameters/gestures. The gestural aspect of the sound becomes easier to experience." [7] - The interactions with the instrument should not be so subtle that they are unnoticeable. Right-hand movement and left-hand fingering should be visible to the audience, without seeming exaggerated or impeding performance capabilities.

Figure 3. First prototype, demonstrating position of FSRs and joystick

These sensors are further complemented by a rangefinder placed along the side of the instrument's body. This sensor was added to allow further control of the instrument with the right hand. The basic design of the instrument was set out to mimic several existing interfaces. In this regard, the FSRs under the fingers are intended to be similar to the holes on a wind instrument such as a flute. The actions of the fingers of the left hand are also similar to those of a guitarist or violinist, where pitch is controlled by the position of the fingers on the interface. Similarly, the entire body of the instrument is intentionally designed to be akin to that of a dynamic microphone. This is to encourage familiarity and suggest natural interactions from the musician, such as singing or creating other vocal content. All of the early design intentions can be seen in Figure 4, which shows a detailed plan of the structure of the instrument and sensor locations.

2.3 Design Alterations


As the project progressed, several changes in the design were implemented, although the overall design and interactions stayed largely the same. One of these changes was the choice to use an ultrasonic rangefinder instead of an infrared sensor to gauge the interactions of the right hand. This decision was made in order to circumvent input value problems with the IR sensor at close range. The IR sensor can only measure reliably down to approximately 10 cm; any closer, its output values flip and it begins to function in the opposite direction to what would be anticipated. The ultrasonic sensor, however, continues to give accurate distance values in millimetres down to 10 mm from the sensor itself, which offers more than enough accuracy for the intended application. The lip that is visible in Figure 4 was reconsidered in order for a different method of support to be implemented. This method involves the musician wearing a glove with an elastic section into which the microphone can be placed. The intention is to eliminate the need for the musician to consciously hold the instrument; instead, it would be attached to the glove and worn, making it possible to properly support the microphone without affecting any of the sensors. This type of support is shown in Figure 5.

Figure 6. Papier mâché prototype of instrument design

Using this kind of physical model, it was much easier to imagine and test the interactions with the instrument. While this model allowed for conceptual interactions, it did not permit actual integration of electronics due to its fragile nature. The next prototype was a more solid construction, fashioned from a section of plastic piping. With the electronics mounted to this prototype, it was possible to get a much more accurate representation of the interactions with the final instrument. Figure 7 shows this prototype with the joystick and FSRs mounted. With an instrument that relies on such physical control mechanisms, it is possible to derive a great deal of tactile feedback from the instrument itself.
In this sense, the instrument becomes much more similar to a traditional acoustic instrument, in that the musician can feel the effect they are having on the instrument. The joystick gives a physical and visual indication of its position at all times and provides sufficient resistance to make the musician aware of the action they are imparting on the sensor. In a similar sense, the musician can feel how much pressure they are putting on the FSRs due to the resistance of the instrument body itself.
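The rangefinder's distance conversion can be sketched briefly. The paper does not name the part, so an HC-SR04-style ultrasonic sensor is assumed here, where the Arduino measures the round-trip echo time of an ultrasonic pulse; the conversion to millimetres is shown in Python for illustration, although the actual code runs on the Arduino:

```python
def echo_to_mm(echo_us):
    """Convert a round-trip ultrasonic echo time (microseconds) to a
    one-way distance in millimetres. Sound travels at roughly 343 m/s,
    i.e. 0.343 mm per microsecond; the result is halved because the
    pulse travels to the object and back."""
    return echo_us * 0.343 / 2.0
```

At this resolution, readings remain usable down to the 10 mm minimum noted above.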

Figure 5. Microphone supported with material attached to glove.

3. PROTOTYPING

3.1 Physical Prototyping


Much of the motivation behind this project came from the desire to create a tangible instrument that would allow the musician a great deal of physical interaction and control over the sonic parameters concerned. This emphasis on physical interaction led to several stages of prototyping, beginning with the cardboard models shown above, with interaction zones sketched on and sensors attached. The next stage in physical prototyping was the creation of a more accurate papier mâché model, shown in Figure 6. This model is much closer to the intended dimensions of the final instrument design and also illustrates the positioning of the sensors.

Figure 7. Physical prototype with Joystick and FSRs

3.2 Electronics and Coding


Electronic prototyping began with basic input of FSR data through the Arduino analog inputs. The wiring for these circuits follows the principle of a simple potential divider, with the analog input taken from between one leg of the FSR and a grounded 10K resistor.
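The potential divider can be checked numerically. This sketch (Python, for illustration; the actual reading is taken with the Arduino's analogRead) predicts the 10-bit ADC value for a given FSR resistance, assuming the FSR sits between Vcc and the analog pin, with the 10K resistor tying the pin to ground:

```python
def fsr_adc_reading(r_fsr_ohms, r_fixed_ohms=10_000, adc_max=1023):
    """Expected 10-bit ADC value at the divider's midpoint.
    More force lowers the FSR's resistance, so the reading rises."""
    return round(adc_max * r_fixed_ohms / (r_fsr_ohms + r_fixed_ohms))
```

An untouched FSR (on the order of 1 MOhm) reads close to zero, while a hard press (resistance approaching zero) approaches full scale.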

This circuit took very little prototyping and was a quick method of getting multiple inputs into the Arduino. Communication between the Arduino and Max involved breaking each input value into two sections: the top three bits and the bottom seven bits of a 10-bit number (Figure 8). This was necessary because the range of input values (0-1023) is too large to send to Max as a single byte (0-255). This way, the full range of the input values could be encoded in Arduino and then decoded in Max.
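The split described above can be sketched as follows (Python for clarity; the original encode runs in Arduino C and the decode in a Max patch). Serial framing, i.e. how the receiver knows which byte is which, is omitted here:

```python
def encode_10bit(value):
    """Split a 10-bit value (0-1023) into a top-3-bit byte and a
    bottom-7-bit byte, as in Figure 8."""
    assert 0 <= value <= 1023
    return value >> 7, value & 0x7F

def decode_10bit(high, low):
    """Reassemble the two bytes into the original 10-bit value."""
    return (high << 7) | low
```

For example, 712 encodes to (5, 72) and decodes back to 712, so the full 0-1023 range survives the byte-sized serial link.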

Figure 10. Joystick attached to breakout board

Figure 8. 10-bit serial communication encoding in Arduino

Much of the rest of the wiring was straightforward switches. The distance value from the rangefinder was mapped to the colour and intensity of an RGB LED, changing from dim green to bright red as the distance from the sensor increases. The initial intention was to use multiplexing to accommodate the two inputs required by the joystick, the four FSRs, and the ultrasonic sensor, given only six available analog inputs on the Arduino. However, the ultrasonic sensor employed functions by using a PWM output from the Arduino to trigger a signal, and uses another PWM input to return a value representing the distance of the closest object to the sensor, meaning it does not occupy an analog input and multiplexing proved unnecessary. Figure 9 shows the working breadboard prototype of the four FSRs, ultrasonic sensor, three-way switch, and RGB LED.
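The LED mapping can be sketched as a crossfade from dim green (near) to bright red (far) over the sensor's working range. The linear curve and the exact brightness floor are assumptions for illustration; the 600 mm span matches the maximum range at which the sensor is read:

```python
def distance_to_rgb(distance_mm, max_mm=600):
    """Dim green when the hand is close to the sensor, brightening
    to full red as it moves away (assumed linear crossfade)."""
    t = max(0.0, min(1.0, distance_mm / max_mm))
    brightness = 0.25 + 0.75 * t        # dim when near, bright when far
    red = round(255 * t * brightness)
    green = round(255 * (1.0 - t) * brightness)
    return red, green, 0
```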

4. INTERACTION AND SONIC PRODUCTS


Many of the interactions with this instrument are intended to mimic those of acoustic instruments. In this sense, the input values are mapped on more than a one-to-one basis such that, for example, a high rate of change calculated from the FSRs will cause an increase in the high frequency content passed through a filter. Another desirable characteristic set out at the beginning of this project was a dampening with the left hand that would mimic the deadened, harmonically complex sound achieved when the fingers of the fretting hand rest on the strings of a guitar. Just as acoustic instruments are built upon systems that are far more complex than simple one-to-one mappings, so too was the intention of this instrument to develop a layered system of dependencies without tending towards overcomplicated patterns or interactions.
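As an illustration of such a more-than-one-to-one mapping, the sketch below derives a rate of change from successive FSR readings and maps it to a lowpass cutoff. The sensor rates and cutoff frequencies are hypothetical values chosen for the example, not taken from the actual patch:

```python
def force_rate(prev, curr, dt=0.01):
    """Absolute rate of change between two FSR readings taken dt seconds apart."""
    return abs(curr - prev) / dt

def rate_to_cutoff(rate, rate_max=50_000, f_min=500.0, f_max=8_000.0):
    """Faster force changes open the filter, passing more high-frequency
    content (hypothetical parameter ranges)."""
    t = min(1.0, rate / rate_max)
    return f_min + t * (f_max - f_min)
```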

4.1 Transposition and expressivity


The most obvious interaction with this instrument begins with the FSRs. The force sensitive resistors are responsible for two distinct tasks. Firstly, these sensors are used as a means of indicating a transposition factor for altering the pitch of the input. This is afforded by forming a binary pattern across all of the FSRs. By programming the Max patch to detect whether or not there is any force on each sensor, each FSR has two possible states: on and off. With four sensors, this allows for a total of 16 possible binary patterns. This is designed in the same manner as binary digits in computing, with the bottom FSR (FSR 4) representing the smallest, right-most bit. Each state represents the number of semitones by which the original input is transposed. In order to extend the range of transposition possible with this instrument, the joystick button is implemented as a shift control. A further 16 transposition states are attained by considering whether or not the joystick button is pressed. As a result of this addition, there is a total range of 33 semitones: 16 up, 16 down, and the original sound. These binary patterns, however, do not represent actual musical notes, but instead simply indicate the relationship between the original sound and the transposed version. Much of the concept of this instrument is designed upon this idea of relationships between moving sound objects rather than set, definable notes. Because of this notion of relationships, the binary patterns are mapped out such that the transposition pattern on either side of the original sound mirrors the same fingering pattern on the other side, i.e. where 0010 with the joystick button not pressed indicates a transposition of +2 semitones, 0010 with the joystick button pressed represents a transposition of -2 semitones.

Figure 9. Electronics wiring prototype of FSRs, rangefinder, switch and RGB LED

Prototyping the joystick was essentially just a case of connecting the wires to the breakout board (Figure 10) and inputting these values into the Arduino. The advantage of using a control such as this is that it provides a comfortable method of controlling multi-dimensional parameters. There is also a push switch located below the joystick, which allows for further control using a single digit.

While this mapping might seem counterintuitive from the perspective of acoustic instruments such as woodwinds, where fingering patterns move sequentially through the available notes, one must consider that the input to the Augvox system is not fixed, and hence any related streams of sound should work outward, using the initial input as a reference point. The other function of the FSRs is to measure expressive gestures. As a result of the FSRs' primary role as binary pattern indicators, it becomes impractical to utilise each sensor as an individual expressive control. Hence, a new relationship must be defined in order to extract variable data from all four sensors simultaneously as an "expressive section" of the instrument. While calculations such as mean input value, rate of change, and extreme high and low force are obtained from each sensor, it is the average of these values across the active sensors that is mapped to the manipulative processes of the system. For example, in order to gauge the amount of force being input by the musician, the force from each active sensor is summed and averaged. In this way, the overall interaction with the force sensors can still be calculated regardless of whether the binary pattern is 1111 or 0001. A comparison was made early in this project to the left-hand movements of a guitarist, and how the force exerted on the strings can cause vastly differing tones, such as the difference between open harmonics and a cleanly fretted note. This concept was built upon in the design of this instrument. The intention here was to section off two extremes on the force scale and utilise the data from within these sections to drive vastly different sonic processes. The first of these is an overdrive process, which begins to occur at 80% of total force. This process harnesses the "squeezing" gesture of the musician to represent an action similar to over-blowing a wind instrument or striking strings hard enough to induce a buzzing sound.
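The binary transposition scheme described above reads naturally as a small lookup. This sketch (Python, illustrative; the detection actually happens in the Max patch) treats FSR 4 as the least-significant bit and lets the joystick button mirror the same fingering below the original pitch, matching the 0010 example; it is one plausible reading of the mapping:

```python
def transposition_semitones(fsr_states, shift_pressed):
    """fsr_states: booleans (FSR1, FSR2, FSR3, FSR4), FSR4 being the
    right-most, least-significant bit. The joystick 'shift' button
    mirrors the same fingering downwards in pitch."""
    value = 0
    for state in fsr_states:            # FSR1 contributes the top bit
        value = (value << 1) | int(state)
    return -value if shift_pressed else value
```

So 0010 gives +2 semitones unpressed and -2 semitones pressed, while the all-off pattern leaves the input untransposed.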
The overdriven sound is achieved by passing the input through a custom-made bit-crusher, which uses the degrade~ object to reduce the effective sampling and bit rates. Similarly, a "muting" action is implemented below 25% force, whereby filtering significantly reduces the high frequencies of the input sound using both a onepole~ and a biquad~ object. This filtering is designed to be indicative of the deadening of a string, such that many of the harmonics cannot be heard.
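The expressive side of the FSRs can be sketched as an average over the sensors currently held, classified against the 80% and 25% thresholds described above. The on/off threshold separating a held sensor from resting noise is an assumed value:

```python
def average_active_force(forces, on_threshold=50):
    """Mean raw force (0-1023) over the sensors currently pressed."""
    active = [f for f in forces if f > on_threshold]
    return sum(active) / len(active) if active else 0.0

def force_regime(avg_force, full_scale=1023):
    """Classify the averaged force into the three sonic regimes."""
    pct = avg_force / full_scale
    if pct >= 0.80:
        return "overdrive"   # squeeze: bit-crush the signal
    if pct <= 0.25:
        return "mute"        # light touch: heavy lowpass filtering
    return "clean"
```

Averaging only over the active sensors means that 0001 pressed hard lands in the same regime as 1111 pressed hard.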

The next process implemented is convolution of the input with simple frequency modulation. This allows the musician further control of the harmonic content of the sound. Control of this process is attained using the horizontal axis of the joystick. The values from the sensor are mapped to the "richness" of the sound, both by altering the mix between the original and convolved sound and by manipulating the modulation index and harmonicity factor of the frequency modulation. The intention here is to produce a base level of modulation while the joystick is at rest, and to use the sensor to control the complexity of the sound, similar to how the sustain pedal on a piano allows the sound to be opened up to produce more harmonics. In a further attempt to build upon the harmonic and expressive control of the sound, the vertical axis of the joystick is mapped to both the filter and mix level of a modulating delay. The use of the two axes of the joystick in this way allows for vastly differing sonic results afforded, for example, by the bottom-left and top-right positions. Further modulation takes place in the tremolo subpatch, which is intended to add a more natural-sounding fluctuation in pitch. At this point, the frozen stream breaks off, allowing the musician to "grab" the sound as it has been processed up to this point. The [speed] sub-patch uses fftz.residency~ to constantly sample the last 1000ms of the transposed stream. When the musician begins to interact with the ultrasonic sensor, this sampling stops and the last 1000ms of sound is used like a sampled chord, the speed of which is controlled by the distance from the sensor. By placing the right hand directly upon the sensor, the sound is frozen. As the musician's hand moves further away from the sensor, the speed of the sound increases until it finally reaches its original rate.
The intention of this interaction is to allow for a sonic constant, similar to a chord, that accompanies the melodic transpositions incurred by the left hand. The right hand movement itself could be likened to the closing down of the sound, giving the sampled content less space to move freely, as the distance between the hand and the sensor decreases. Once there has been no interaction with the sensor for 1000ms, the sound decays and the automatic sampling process recommences.
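The freeze behaviour described above amounts to a rolling buffer whose playback rate follows the hand. This sketch (Python, illustrative; the real version uses fftz.residency~ in Max) shows the sampling logic and the distance-to-speed mapping, assuming a linear ramp over the sensor's range:

```python
from collections import deque

class FreezeSampler:
    """Continuously retains the most recent window of samples; interacting
    with the rangefinder freezes that window for repitched playback."""
    def __init__(self, window_samples):
        self.buffer = deque(maxlen=window_samples)  # e.g. 1000 ms of audio
        self.frozen = None

    def feed(self, sample):
        if self.frozen is None:        # sampling pauses while frozen
            self.buffer.append(sample)

    def freeze(self):                  # hand enters the sensor's beam
        self.frozen = list(self.buffer)

    def release(self):                 # 1000 ms with no interaction
        self.frozen = None

def playback_speed(distance_mm, max_mm=600):
    """Hand on the sensor: speed 0 (frozen); at maximum range the frozen
    sound runs at its original rate (assumed linear ramp)."""
    return max(0.0, min(1.0, distance_mm / max_mm))
```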

4.2 Multiple Audio Streams


Building upon the concept of sonic relationships, this instrument produces three distinct streams of content, referred to here as the original, transposed, and frozen streams. The first stream is the original input, which passes through several filtering and modulating processes. This stream is included in the final output by moving the switch on the back of the body to the upward position. The second stream is the pitch-altered derivative of the original input, transposed according to the binary pattern indicated by the FSRs and joystick button. The third stream is a "frozen" section controlled by the interaction of the right hand with the ultrasonic sensor. Each of these streams breaks off at a different point in the signal chain and enters a number of individual processing stages. The common processes for each of the streams are as follows:

lowpass filter -> compressor -> overdrive -> equalisation

From this point, the original sound enters the bit-crusher process and then continues to a common delay and reverb section. The further processing of the transposed stream begins with an fftz.tuner~ object, which pitch-shifts the input according to the transposition factor from the FSRs. The RGB LED beside the ultrasonic sensor (Figure 11) indicates the distance between the sensor and the hand by altering the colour from red to green as the distance decreases. This visual feedback is an important feature, as the angle of transmission from the sensor widens significantly with distance, reducing the musician's ability to identify the proper location in space for interaction.

Figure 11. Final Instrument with sunken sensors and RGB LED

Some minor alterations to the Arduino code are intended in order to improve the response of the RGB LED, particularly when the right hand enters or leaves the beam of the sensor at a significant distance. In order to minimise interference from surrounding objects during performance, the ultrasonic sensor was coded to ignore any objects at a distance greater than 600mm.
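This gating can be sketched as a simple filter on incoming readings. The 600 mm cutoff is from the text above; the median-of-three spike rejection is an illustrative addition, not something described in the paper:

```python
def gated_distance(last_three, max_mm=600):
    """Median of the last three readings, with anything beyond the
    cutoff treated as 'no hand present' (returns None)."""
    median = sorted(last_three)[1]
    return median if median <= max_mm else None
```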

5. DISCUSSION AND ISSUES


Some issues were encountered during the development of this instrument. Firstly, having completed the final version of the physical instrument, some sensor interference was noted. The main issue here was a bleed of values across the axes of the joystick. While the horizontal axis functioned as intended, the vertical axis also affected the horizontal axis to a certain extent, particularly when the joystick was moved down from the centre position. This was determined to be an issue with the component itself; however, as it had already been soldered and glued in place, the bleed was instead exploited in the mapping. As the downward movement of the joystick represents a deadening of the sound by reducing the delay parameters, the leftward movement of the horizontal axis also deadens the sound by reducing the amount of harmonics produced by the FM synthesis. The pairing of the two processes simply allows simultaneous manipulation without compromising the other functions of the sensor. Another issue with this instrument is the operation noise that occurs when using the instrument. The most significant contributor to this operation noise is the joystick button, which causes a click to be picked up every time it is pressed. This issue was predicted in the design process but, from experimentation with the instrument, was not seen as a significant problem. Care can be taken to press the button gently, and the click is processed with the original stream to become a part of the instrument's sound. This could also be seen as a potential tool for percussive sounds. A noticeable inconvenience in operating this instrument is the necessity to calibrate each sensor every time the Max patch is opened. While this can be frustrating, its benefits have been determined to be significant enough to keep this system.
This action is comparable to tuning an instrument before performance and helps to ensure that the sensors are working and tailored to the ranges of the particular musician. A final issue concerns the handling of the instrument. The initial intention was to develop a glove with an elastic section in which the instrument would be suspended. This would free the musician of the task of supporting the instrument with the fingers and thumb of the left hand. This glove has not yet been built for this project and, as predicted, there are some minor issues with operating and handling the instrument simultaneously. These can be largely overcome with practice and creative techniques, but the glove is still intended to be investigated further in the near future.
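The per-sensor calibration step can be sketched as recording each sensor's observed extremes and rescaling subsequent readings; this is one plausible reading of what the Max patch does at start-up, not a transcription of it:

```python
class SensorCalibration:
    """The musician exercises a sensor over its full range once;
    later raw readings are rescaled to 0.0-1.0 within that range."""
    def __init__(self):
        self.low = float("inf")
        self.high = float("-inf")

    def observe(self, raw):            # called during the calibration pass
        self.low = min(self.low, raw)
        self.high = max(self.high, raw)

    def scale(self, raw):              # called during performance
        if self.high <= self.low:
            return 0.0
        t = (raw - self.low) / (self.high - self.low)
        return max(0.0, min(1.0, t))
```

Like tuning before a performance, this tailors every sensor to the reach and grip strength of the particular musician.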

In particular, the physicality of the instrument proved to be both a challenging element to develop and a rewarding path to follow, due to the connection afforded between musician and instrument. Sonically, the intention of this instrument was to create drawn-out, layered tones that tend towards an overdriven quality. In this sense, a dissonant balance can be created between the melodic drones of the frozen stream and the distorted bit-crushing of the transposed stream. By implementing multiple versions of the sound, the musician is afforded the ability to create complex sonic layers, while maintaining the distinct advantage of total control over the input sound.

6.1 Further work


While the instrument is quite developed in its current state, there is much more potential for designing meaningful and expressive interactions. A significant amount of data was extracted from each sensor in order to facilitate simple but effective mappings, and much of the data calculated in the patches was not mapped to sonic parameters in this version of the project. The intention for future versions is to implement mappings such as linking a sustained pattern in the FSR rate of change to the tremolo rate, to allow for more control over small variations in the sound. Also briefly explored in this project was the use of vocal analysis to control manipulation parameters. In this way, characteristics of the original input such as frequency and amplitude could be implemented as integral factors in the further processing of the audio. Early experimentation with the analyzer~ object proved too computationally expensive to pursue further in the given timeframe. However, with proper investigation, this could prove a successful route for further developing the capabilities of the instrument.
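As a cheaper stand-in for full spectral analysis, a block-wise amplitude follower recovers one of the characteristics mentioned above (amplitude) at very low cost. The sketch below illustrates the idea in Python; it is not the analyzer~ object itself, and the block size is an assumed value:

```python
import math

def rms_envelope(samples, block_size=64):
    """Block-wise RMS of an audio signal: a lightweight amplitude
    follower whose output could drive processing parameters."""
    envelope = []
    for i in range(0, len(samples) - block_size + 1, block_size):
        block = samples[i:i + block_size]
        envelope.append(math.sqrt(sum(s * s for s in block) / block_size))
    return envelope
```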

7. REFERENCES
[1] Charles, J.F. 2008. A tutorial on spectral sound processing using Max/MSP and Jitter. Computer Music Journal 32, 87-102.
[2] Cook, P. 2001. Principles for designing computer music controllers. Proceedings of the 2001 Conference on New Interfaces for Musical Expression, pp. 1-4.
[3] Cook, P.R. 2004. Remutualizing the musical instrument: Co-design of synthesis algorithms and controllers. Journal of New Music Research 33, 315-320. http://www.tandfonline.com/doi/abs/10.1080/0929821042000317877.
[4] Hunt, A., Wanderley, M., & Paradis, M. 2003. The importance of parameter mapping in electronic instrument design. Journal of New Music Research 32, 429-440.
[5] Nielsen, J. 1994. Heuristic evaluation. In Nielsen, J., and Mack, R.L. (Eds.), Usability Inspection Methods, John Wiley & Sons, New York, NY.
[6] Norman, D.A. 2002. The Design of Everyday Things. Basic Books, New York.
[7] Schloss, W.A. 2003. Using contemporary technology in live performance: The dilemma of the performer. Journal of New Music Research 32, 239-242.

6. CONCLUSION
This paper has provided a review of the development of the Augvox project. The project motivation has been discussed, and early sketches and prototypes from different stages of the design process have been presented. An overview of the physical and electronic design of the instrument has been given, and its interactions and sonic processes have been clarified. Where possible, references have been included to ideas that may have motivated certain stages of the development, or techniques that are being considered for the future.
