
ISAST Transactions on Electronics and Signal Processing

No. 1, Vol. 2, 2008 (ISSN 1797-2329)


Regular Papers
Hamid Z. Fardi:
Modeling and Characterization of 4H-SiC Bipolar Transistors ...................1
Yun Zhang, Mei Yu and Gangyi Jiang:
Evaluation of Typical Prediction Structures for Multi-view Video Coding 7
Adriana Serban, Magnus Karlsson and Shaofang Gong:
Microstrip Bias Networks for Ultra-Wideband Systems............................16
Jonathan M. Blackledge:
Multi-algorithmic Cryptography using Deterministic Chaos with
Applications to Mobile Communications...................................................21
Magnus Karlsson and Shaofang Gong:
Monopole and Dipole Antennas for UWB Radio Utilizing a Flex-rigid
Structure......................................................................................................59
Magnus Karlsson and Shaofang Gong:
Monofilar spiral antennas for multi-band UWB system with and without
air core ........................................................................................................64
Jean C. Chedjou, Kyandoghere Kyamakya, Van Duc Nguyen, Ildoko
Moussa and Jacques Kengne:
Performance Evaluation of Analog Systems Simulation Methods for the
Analysis of Nonlinear and Chaotic Modules in Communications .............71
Adriana Serban, Magnus Karlsson and Shaofang Gong:
A Frequency-Triplexed RF Front-End for Ultra-Wideband Systems
3.1-4.8 GHz.................................................................................................83
Jonathan M. Blackledge:
Application of the Fractal Market Hypothesis for Macroeconomic Time
Series Analysis............................................................................................89
Pär Håkansson, Duxiang Wang and Shaofang Gong:
An Ultra-Wideband Six-port I/Q Demodulator Covering from 3.1 to 4.8
GHz ...........................................................................................................111

Greetings from ISAST


Dear Reader,
You are holding the first ISAST Transactions on Electronics and Signal Processing. It consists of ten original contributed scientific articles in various fields of intelligent systems. Every article has gone through a peer-review process.
ISAST (International Society for Advanced Science and Technology) was founded in 2006 for the purpose of promoting science and technology, mainly electronics, signal processing, communications, networking, intelligent systems, computer science, scientific computing, and software engineering, as well as related areas, not forgetting emerging technologies and applications.
To show how large the diversity of the electronics and signal processing field is today, we briefly summarize the contents of this issue of the Transactions. Hamid Z. Fardi presents a research paper on a new model for the design and optimization of 4H-SiC bipolar transistors. Yun Zhang, Mei Yu and Gangyi Jiang analyze and evaluate typical multi-view video coding schemes in their research paper. Adriana Serban, Magnus Karlsson and Shaofang Gong contribute a study on optimizing broadband microstrip bias networks to reduce resonance in broadband RF circuits. Jonathan M. Blackledge has an extended paper on multi-algorithmic cryptography using deterministic chaos. Magnus Karlsson and Shaofang Gong present two research papers on ultra-wideband antenna design: the first discusses circular monopole and dipole antennas utilizing a flex-rigid structure, and the second discusses spiral antennas with and without an air-core structure. Jean C. Chedjou, Kyandoghere Kyamakya, Van Duc Nguyen, Ildoko Moussa and Jacques Kengne evaluate the performance of different analog systems simulation methods, which they use to investigate nonlinear and chaotic dynamics in communication systems. Adriana Serban, Magnus Karlsson and Shaofang Gong present a novel design of an RF front-end for multiband and ultra-wideband systems with a fully integrated filter, triplexer network and flat-gain low-noise amplifier. In his second extended paper, Jonathan M. Blackledge introduces the idea of using the fractal market hypothesis in macroeconomic time series analysis. Finally, Pär Håkansson, Duxiang Wang and Shaofang Gong study a six-port I/Q demodulator that covers the ultra-wideband spectrum.
We are happy to have received so many manuscripts with ambitious and impressive ideas. We hope that you will inform your colleagues all over the academic, engineering, and industrial world of the existence of our Society.
Best Regards,
Professor Timo Hämäläinen, University of Jyväskylä, Finland, Editor-in-Chief
Professor Jyrki Joutsensalo, University of Jyväskylä, Finland, Vice Editor-in-Chief
M.Sc. Simo Lintunen, University of Jyväskylä, Finland, Co-editor


Modeling and Characterization of 4H-SiC Bipolar Transistors

H. Z. Fardi, Senior Member, IEEE

Abstract—The PISCES-IIB two-dimensional device simulation program is used to model the behavior of 4H-SiC bipolar junction transistors. The physical material parameters in PISCES, such as carrier mobility and lifetime, temperature-dependent bandgap, and the density of states, are modified to accurately represent 4H-SiC. The simulation results are compared with measured experimental data obtained by others. The comparisons are made for two different devices that are of interest in power electronics and RF applications. The simulation results predict a dc current gain of about 25 for the power device and a gain of about 20 for the RF device, in agreement with the experimental data. The comparisons confirm the accuracy of the modeling employed. The simulated current-voltage characteristics indicate that higher gain may be achieved for 4H-SiC transistors if the leakage current is reduced. The simulation work discussed in this paper complements current research in the design and characterization of 4H-SiC bipolar transistors. The model presented will aid in interpreting experimental data over a wide range of temperatures. This paper reports on a new model that provides insight into the device behavior and shows the trend in dc gain performance, which is important for the design and optimization of 4H-SiC bipolar transistors operating at or above room temperature.


Index Terms—4H-SiC, BJT, dc gain, base resistance.


I. INTRODUCTION

There has been considerable interest in Silicon Carbide (SiC) bipolar transistors for high-power, high-temperature and high-frequency applications, due to the material's superior physical and electrical properties such as a wide bandgap, high saturation velocity, high breakdown voltage, and high thermal conductivity. Recently, high-power and high-performance high-frequency 4H-SiC bipolar transistors have been fabricated with demonstrated differential dc gain levels of about 20 to 25 at room temperature [1,2,3]. Researchers have fabricated and used 4H-SiC in a variety of device applications such as high-performance bipolar transistors [4, 5]. In order to develop and design devices based on the electronic properties of 4H-SiC, a thorough knowledge of their transport properties is needed. While some of this information is already available through device measurements and characterization, device modeling provides an in-depth analysis of the dc gain in 4H-SiC bipolar transistors operating at and above room temperature. By means of device modeling, the device structure and material parameters can be related directly to measurements and physical parameters; the model is hence valuable in interpreting experimental data over a wide range of temperatures. The simulation work discussed in this paper complements current research in the design and characterization of 4H-SiC bipolar transistors [6, 7, 8].

Advances in the processing and characterization of SiC devices further demonstrate the need for device optimization and design through the use of accurate device modeling. Some of these efforts are already established. For example, a field-dependent carrier mobility model as a function of temperature and concentration is presented in [9], where a simplified analytical expression accurately models the field-velocity relationship in SiC devices, similar to the progress made in silicon materials and devices.

H. Z. Fardi is with the Electrical Engineering Department, University of Colorado at Denver and Health Sciences Center, Campus Box 110, P.O. Box 173364, Denver, CO 80217-3364 (e-mail: fardi@ieee.org; phone: 303-556-4938; fax: 303-556-2383).

II. DEVICE STRUCTURE

Two device structures are investigated in this study: device A, designed for high-frequency applications, and device B, designed for power. On the RF side, device A [2] has an active emitter area of 5 × 50 μm². The emitter-base spacing is 5 μm. The 4H-SiC emitter is 0.2 μm thick and doped n-type at 2 × 10¹⁹ cm⁻³. The base is 0.1 μm thick and p-type at 2 × 10¹⁸ cm⁻³. The n-type 4H-SiC sub-collector, with a concentration of 2 × 10¹⁶ cm⁻³, is 3 μm thick. Device B, designed for high-power high-temperature electronics [5], has an active area of 1.2 mm². It contains 43 emitter fingers, each with a dimension of 1186 μm × 14 μm, doped n-type with a doping density of about 2 × 10¹⁹ cm⁻³. The emitter is 0.8 μm thick. The base-emitter spacing is 3 μm. The base is p-type doped with a density of 4.1 × 10¹⁷ cm⁻³. The sub-collector doping density is about 8.5 × 10¹⁵ cm⁻³ and the layer is 12 μm thick. The schematic cross-sectional view for these two devices is similar and is shown in Figure 1. A summary of the device geometry and dimensions, including the doping levels for the two devices A and B, is given in Table I.

Fig. 1. Schematic of 4H-SiC bipolar transistors, representing device A, taken from [1], and device B, taken from [4]. The geometry, doping data, and dimensions are given in Table I.

TABLE I
DEVICE GEOMETRY AND DOPING LEVELS FOR THE TWO DEVICES STUDIED

Parameter                         Device A [2]    Device B [5]
Emitter active area               5 × 50 μm²      1.2 mm²
Emitter width                     5 μm            7 μm
Emitter-base spacing              5 μm            3 μm
Emitter thickness                 0.2 μm          0.8 μm
Emitter doping (cm⁻³)             2 × 10¹⁹        2 × 10¹⁹
Base thickness                    0.1 μm          1 μm
Base contact density (cm⁻³)       1 × 10¹⁹        8 × 10¹⁹
Base doping density (cm⁻³)        2 × 10¹⁸        4.1 × 10¹⁷
Sub-collector length              3 μm            12 μm
Collector doping density (cm⁻³)   2 × 10¹⁶        8.5 × 10¹⁵

III. DEVICE MODELING

The two-dimensional (2D) device simulator PISCES-IIB is used for modeling the 4H-SiC bipolar transistors [10]. This device simulator solves the drift-diffusion partial differential equations and Poisson's equation self-consistently for the electric potential and the electron and hole concentrations. The simulation program was designed primarily for silicon devices; however, material parameters can be modified to incorporate other semiconductor materials. In our case, libraries of the important known material parameters for 4H-SiC were assembled from the literature [3,6,11]. Physical models incorporated in the simulation include Shockley-Read-Hall (SRH) recombination for modeling the leakage current, surface generation-recombination mechanisms, mobility models, ionized and neutral impurity effects, and the velocity-field relationship. The room-temperature effective densities of states in the conduction band and in the valence band are calculated based on the results obtained in reference [6].
The mobility parameters for 4H-SiC are similar to those of Si, taking into consideration the doping dependence, high-field behavior, and saturation velocity at a given temperature [6,9,11]. The room-temperature (T0) doping dependence for 4H-SiC is based on the following equation [12]:

    B_i(N) = [μ_min,i + μ_max,i (N_g,i / N)^γ_i] / (μ_max,i − μ_min,i),   T = T_0 = 300 K    (1)

where i stands for electrons (n) or holes (p). The values of the constants μ_max,i, μ_min,i, N_g,i, and γ_i are presented in Table II, taken from the literature and approximated for the 4H-SiC material [13-16]. In connection with Equation (1), the temperature-dependent mobility is modeled by:

    μ_i(N, T) = μ_max,i(T_0) · B_i(N) (T/T_0)^β_i / [1 + B_i(N) (T/T_0)^(α_i + β_i)]    (2)

where T_0 is the room temperature and μ_i(N, T) is the doping (N) and temperature dependent carrier mobility. The values of the constants α_i and β_i are given in Table II.
TABLE II
MOBILITY DATA USED IN 4H-SiC TRANSISTORS^a

Parameters                Electrons       Holes
μ_max,i (cm² V⁻¹ s⁻¹)     880, 950        117
μ_min,i (cm² V⁻¹ s⁻¹)     40              30, 33
N_g,i (cm⁻³)              2 × 10¹⁷        1 × 10¹⁹, 2 × 10¹⁷
γ_i                       0.67, 0.76      0.5
α_i                       2.6             2.6
β_i                       0.5             0.5

^a Data are taken from [6,7,13,14].
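As a numerical illustration of Eqs. (1) and (2) as reconstructed above, the short sketch below evaluates the electron mobility at the doping levels of the devices in Table I. Where Table II lists two values, the larger one is taken; that choice, like the snippet itself, is ours and is for illustration only.

    # Illustrative evaluation of the mobility model of Eqs. (1)-(2), using the
    # electron entries of Table II (larger value taken where two are listed).
    MU_MAX, MU_MIN = 950.0, 40.0   # cm^2 V^-1 s^-1
    N_G, GAMMA = 2e17, 0.76        # cm^-3, dimensionless
    ALPHA, BETA = 2.6, 0.5
    T0 = 300.0                     # K

    def B(N):
        """Eq. (1): doping-dependent factor at T = T0."""
        return (MU_MIN + MU_MAX * (N_G / N) ** GAMMA) / (MU_MAX - MU_MIN)

    def mobility(N, T):
        """Eq. (2): doping- and temperature-dependent electron mobility."""
        b = B(N)
        return MU_MAX * b * (T / T0) ** BETA / (1.0 + b * (T / T0) ** (ALPHA + BETA))

    for N in (2e16, 2e18, 2e19):   # collector, base, and emitter doping levels
        print(f"N = {N:.0e} cm^-3:",
              [round(mobility(N, T), 1) for T in (300.0, 423.0, 500.0)])

Note that for N much smaller than N_g,i the factor B_i(N) becomes large and the mobility approaches μ_max,i (T/T_0)^(−α_i), recovering the expected lattice-scattering-limited temperature dependence.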

PISCES-IIB is a drift-diffusion device modeling program that uses both empirical formulations and physical equations for the temperature-dependent material parameters. The general expression adopted for the temperature-dependent bandgap energy (Eg) of Si devices is based on the experimental values of the energy bandgap, using the following empirically fitted relationship [17]:

    E_g(T) = E_g(0) − aT² / (T + b)    (3)

where E_g(0) is the energy gap at zero Kelvin. Since no reports exist on the temperature dependence of the 4H-SiC bandgap, it is assumed that 4H-SiC has the same temperature dependence as 6H-SiC, with the constants E_g(0) = 3.359 eV, a = 3.3 × 10⁻⁴ eV/K, and b = 0. This assumption has been shown to be useful in the characterization of 4H-SiC BJTs [8].

The temperature-dependent carrier lifetimes (τ_n,p) and effective densities of states (N_c,v) are modeled by the following theoretical equations, assuming that phonon scattering is the dominant process at or above room temperature [18]:

    τ_n,p(T) = τ_n,p(T_0) (T/T_0)^α    (4)

    N_c,v(T) = N_c,v(T_0) (T/T_0)^β    (5)

where the constants α = −0.5 and β = +1.5 are assumed, and the relationship τ_n = τ_p is used. Equations (4) and (5) are empirical mathematical relations used to simplify the numerical modeling [19, 20].
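The following small sketch tabulates Eqs. (3)-(5) with the constants quoted above and the room-temperature values from Table III; it is an illustration of the temperature scaling, not part of the simulator.

    # Illustrative evaluation of Eqs. (3)-(5) with the constants from the text.
    EG0, A_EG, B_EG = 3.359, 3.3e-4, 0.0   # eV, eV/K, K (6H-SiC values, assumed)
    TAU0 = 50e-9                           # s, SRH lifetime at 300 K (Table III)
    NC0, NV0 = 1.64e19, 3.22e19            # cm^-3 at 300 K (Table III)
    T0 = 300.0

    def eg(T):  return EG0 - A_EG * T * T / (T + B_EG)   # Eq. (3)
    def tau(T): return TAU0 * (T / T0) ** (-0.5)         # Eq. (4)
    def nc(T):  return NC0 * (T / T0) ** 1.5             # Eq. (5), conduction band
    def nv(T):  return NV0 * (T / T0) ** 1.5             # Eq. (5), valence band

    for T in (300.0, 423.0, 500.0):
        print(f"T={T:.0f} K  Eg={eg(T):.3f} eV  "
              f"tau={tau(T)*1e9:.1f} ns  Nc={nc(T):.2e} cm^-3")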
The ionization energy for the n-type layers (nitrogen-doped) is about 100 meV, and for the p-type layers (aluminum-doped) it is about 220 meV [1,2,3]. An SRH recombination carrier lifetime of 50 ns at room temperature is used in the simulation, which fits the experimentally measured gain of about 20 [19,21,22,23]. It should be noted that the device performance and processing of 4H-SiC have recently been enhanced considerably and a much higher dc gain has been achieved [24].

The material and device parameters used in the simulation are summarized in Table III. The two devices have been modeled using the same set of input parameters, given in Tables II and III.
TABLE III
PARAMETERS FOR NPN 4H-SiC TRANSISTORS AT ROOM TEMPERATURE

Parameters                  Value               Ref.
Saturation velocity         3 × 10⁷ cm/s        [6]
Nc (300 K)                  1.64 × 10¹⁹ cm⁻³    [6]
Nv (300 K)                  3.22 × 10¹⁹ cm⁻³    [6]
SRH lifetime                50 ns               [23]
Surface-recomb. velocity    1 × 10⁵ cm/s        [21]
                            1 × 10⁴ cm/s        [22]
The focus of this work is not to extract physical device parameters from the numerical simulation, but rather to gain insight into the device behavior and to show the trend in dc gain performance, which is important for the design and optimization of 4H-SiC bipolar transistors operating at or above room temperature.

IV. RESULTS AND DISCUSSIONS

In Figure 2, the maximum dc gain as a function of temperature is simulated for device A and compared with the measured data [1]. The room-temperature dc gain is about 20. The gain decreases as the temperature increases, as less current is available at the collector region. The discrepancy at temperatures above 500 K is attributed to both the accuracy of the physical parameters and the series resistance at the p-type base contact, which prevented graphing absolute values rather than normalized ones. A simplified analytical transport model of the bipolar transistor predicts a similar gain-temperature relationship [6].

Fig. 2. The simulated maximum current gain (normalized) versus 1000/T, compared with the measured data for device A [1].
The differential base resistance is an important parameter for bipolar transistors in high-frequency applications. In bipolar junction transistors, the emitter impedes the diffusion of majority carriers (holes) from the base into the emitter, which results in high electron injection efficiency. The differential base resistance is obtained from the inverse slope of the base current (Ib) at a given emitter-base bias (VBE) and temperature in the low-injection region (Figure 3). The high doping density of the base allows the base resistance to decrease without significantly sacrificing the emitter efficiency. As shown in Fig. 3, a similar trend is observed experimentally [1,2,3]. The measured base resistance in device A remains relatively constant at high temperature. The comparison shows the overall strength of the device modeling in predicting the device behavior, making it a useful tool for optimization and design that reduces costly iterative experimental procedures.

Fig. 3. The simulated base resistance (normalized) versus 1000/T, compared with the measured data for device A [1].
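To make the extraction procedure concrete, this sketch estimates the differential base resistance as the inverse slope of Ib versus VBE by finite differences; the sample data below are hypothetical, not the measured values of device A.

    import numpy as np

    # Illustrative extraction of the differential base resistance
    # r_b = (dIb/dVbe)^-1 from sampled Ib-Vbe data (hypothetical numbers).
    vbe = np.array([2.6, 2.7, 2.8, 2.9, 3.0])          # V
    ib  = np.array([0.8, 1.3, 2.1, 3.4, 5.5]) * 1e-6   # A (made-up sample data)

    dib_dv = np.gradient(ib, vbe)   # central differences at interior points
    r_b = 1.0 / dib_dv              # differential resistance at each bias point
    for v, r in zip(vbe, r_b):
        print(f"Vbe = {v:.1f} V  ->  r_b = {r/1e3:.0f} kOhm")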

Figure 4 shows the measured dc current gain over a range of emitter-base voltages for device A [1]. The collector-base bias was kept at a constant voltage of 10 V (the base-emitter bias was probed so that the differential base resistance could be measured directly from the Ib-VBE data). As shown in Fig. 4, the maximum simulated current gain for device A is about 20, which is comparable with the measured data.

Fig. 4. The measured current gain for device A [1,2,3] compared with the simulated results (solid lines) as a function of the emitter-base bias at T = 300 K. The collector-base voltage was kept constant at 10 V.

The measured differential dc gain as a function of collector current for the power device B is shown in Figure 5 [5]. Both the room-temperature data and the data for T = 423 K are shown. The maximum current gain simulated for device B is about 25 at room temperature and about 15 at T = 423 K. The maximum simulated dc gain is in close agreement with the experimental result. The measured gain is somewhat lower than the simulated result at low collector current (low injection), indicating leakage current in the measured data.

Fig. 5. The simulated current gain for device B compared with the measured data [4]. The junction becomes more non-ideal as the temperature increases. The maximum current gains of about 25 at room temperature and about 15 at T = 423 K are in agreement with the simulated results.

The leakage current is illustrated by simulating the base current for device B using the SRH recombination model. The base current as a function of base-emitter bias for device B at two different temperatures is shown in Figure 6. These results predict that the SRH recombination current at low-level injection may degrade the transistor's gain (shown in Figure 5). That is, in the base region the SRH recombination rate would be high, reducing the number of carriers that would otherwise be available at the collector region and resulting in lower gain values.

Also shown in Figure 6 is the simulated turn-on voltage compared with the experimental data at two different temperatures for device B. The measured data were obtained by other investigators [3,5]. The turn-on voltage decreases as the temperature increases, which can be explained by the diode property of the junction. The drift-diffusion model employed accurately predicts the junction current-voltage characteristic at high injection.

Fig. 6. The turn-on voltage for device B at two temperatures, T = 300 K and T = 423 K. The measured data are also shown [4].

The simulated room-temperature base current and collector current (Ic) for device A are shown in Figure 7 and compared with the measured experimental data [1]. The transistor starts to show gain at a base-emitter voltage of about 3 V. The simulation model predicts a lower base current, occurring at a higher base-emitter voltage, than the experimental data. The discrepancy in base current at low base-emitter bias is attributed by Perez et al. to leakage current caused by the incomplete removal of the emitter epitaxial layer on top of the extrinsic base, as reported in detail in the original experimental work [1]. In later work, they report an enhanced device fabrication process that resulted in better device performance [24]. The physical drift-diffusion model applied in the simulation is relatively accurate for most of the device data at high injection. In this regime, however, the series resistance dominates and the model tends to fail to predict the current-voltage characteristics accurately. At low injection, the current-voltage characteristics are modeled by a relatively high SRH recombination rate with a carrier lifetime of 50 ns, which may be at the high end [23]. The simulation results indicate that a higher dc gain can be achieved if the leakage current is reduced.

Fig. 7. The simulated room-temperature base current and collector current for device A compared with the measured data [1,2,3]. The measured base current shows leakage current. A maximum dc gain of about 20 is obtained for this device at room temperature. High-level injection is dominated by series-resistance effects.

V. CONCLUSION

A 2D drift-diffusion simulation program has been used to model 4H-SiC bipolar transistors for both power electronics and high-frequency, high-temperature applications. The 4H-SiC model parameters were applied to an existing device simulator, PISCES-IIB, developed primarily for silicon transistors. Simulated results were obtained for the base resistance, the differential dc gain, and the current-voltage characteristics over a wide range of bias and temperature. The comparison with measured experimental data confirms the accuracy of the modeling over the range of temperatures investigated. The simulation results predict a maximum dc gain of about 20 for device A and about 25 for device B, in agreement with the measured data. The modeling results show that a higher dc gain can be achieved if the leakage current is minimized for 4H-SiC BJTs.
REFERENCES
[1] Perez-Wurfl, I., et al., "4H-SiC bipolar junction transistor with high current and power density," Solid-State Electronics, vol. 47, pp. 229-231, 2003.
[2] Perez-Wurfl, I., Konstantinov, A., Torvik, J., and Van Zeghbroeck, B., "RF 4H-SiC bipolar transistors," Proceedings of the Lester Eastman Conference, Newark, DE, Aug. 6-8, 2002.
[3] Perez-Wurfl, I., Torvik, J., and Van Zeghbroeck, B., "4H-SiC RF bipolar junction transistors," Proceedings of DRC, pp. 27-28, 2003.
[4] Huang, C.-F., and Cooper, J. A., Jr., "High-performance power BJTs in 4H-SiC," IEEE 7803-7447, 2002.
[5] Zhao, J. H., et al., "A high voltage (1750 V) and high current gain (β = 24.8) 4H-SiC bipolar transistor using a thin (12 μm) drift layer," ICSRM, manuscript no. 470, session ThP3-14, Lyon, France, Oct. 5-10, 2003.
[6] Danielsson, E., "Processing and electrical characterization of GaN/SiC heterojunctions and SiC bipolar transistors," ISRN KTH/EKT/FR-01/1-SE, KTH, Royal Institute of Technology, Stockholm, 2001.
[7] Danielsson, E., et al., Solid-State Electronics, vol. 47, p. 639, 2003.
[8] Li, X., et al., "On the temperature coefficient of 4H-SiC BJT current gain," Solid-State Electronics, vol. 47, pp. 233-239, 2003.
[9] Mnatsakanov, T. T., et al., "Carrier mobility model for simulation of SiC-based electronic devices," Semiconductor Science and Technology, vol. 17, no. 9, pp. 974-977, 2002.
[10] Pinto, M. R., Rafferty, C. S., and Dutton, R. W., "PISCES II: Poisson and Continuity Equation Solver," Integrated Circuits Laboratory, EE Department, Stanford University, CA, 1984.
[11] Raghunathan, R., and Baliga, B. J., "P-type 4H and 6H-SiC high-voltage Schottky barrier diodes," IEEE Electron Device Letters, vol. 19, no. 3, pp. 71-73, 1998.
[12] Caughey, D. M., and Thomas, R. E., Proceedings of the IEEE, vol. 55, p. 2192, 1967.
[13] Mnatsakanov, T. T., Pomortseva, L. I., and Yurkov, S. N., Semiconductors, vol. 35, p. 394, 2001.
[14] Roschke, M., and Schwierz, F., IEEE Transactions on Electron Devices, vol. 48, p. 1442, 2001.
[15] Morkoc, H., et al., "Large-band-gap SiC, III-V nitride, and II-VI ZnSe-based semiconductor device technologies," J. Appl. Phys., vol. 76, no. 3, pp. 1363-1398, 1994.
[16] Joshi, R. P., "Monte Carlo calculation of the temperature- and field-dependent electron transport parameters for 4H-SiC," J. Appl. Phys., vol. 78, no. 9, pp. 5518-5521, 1995.
[17] Thurmond, C. D., "The standard thermodynamic functions for the formation of electrons and holes in Ge, Si, GaAs, and GaP," J. Electrochem. Soc., vol. 122, pp. 1133-1141, 1975.
[18] Sze, S. M., Physics of Semiconductor Devices, John Wiley & Sons, New York, 1981.
[19] Arora, N. D., Hauser, J. R., and Roulston, D. J., IEEE Transactions on Electron Devices, vol. 29, p. 292, 1982.
[20] Ayalew, T., "SiC Semiconductor Devices Technology, Modeling, and Simulation," Ph.D. thesis, Technische Universität Wien, 2004.
[21] Galeckas, A., et al., Appl. Phys. Lett., vol. 79, p. 365, 2001.
[22] Ivanov, P. A., et al., "Factors limiting the current gain in high-voltage 4H-SiC npn BJTs," Solid-State Electronics, vol. 46, pp. 567-572, 2002.
[23] Fardi, H. Z., "Modeling the dc gain of 4H-SiC bipolar transistors as a function of surface recombination velocity," Solid-State Electronics, 2005.
[24] Van Zeghbroeck, B., Perez, I., Zhao, F., and Torvik, J., "Technology development of 4H-SiC BJTs with 5 GHz fMAX," CS MANTECH Conference, pp. 223-226, Vancouver, Canada, April 24-27, 2006.

H. Z. Fardi received the B.S. degree in physics from Tehran University, Iran, in 1978, and the M.S. and Ph.D. degrees in electrical engineering from the University of Colorado at Boulder in 1982 and 1986, respectively. He joined the University of Colorado at Denver in 1992, where he is now a professor of electrical engineering. Prior to this, he was with the National Science Foundation Center for Millimeter-Microwave Computer-Aided Design at the University of Colorado at Boulder. He has worked on numerous research projects related to the physics and modeling of novel semiconductor devices and solid-state electronics. His research on the physics and characterization of indium phosphide photovoltaic devices, gallium arsenide field-effect transistors, pnpn thyristor light-emitting diodes, and silicon carbide bipolar transistors has been published in 42 refereed national and international technical journals and proceedings. He received Western Universities Fellowship awards through the National Renewable Energy Laboratory in 1996 and 1997, working on multi-quantum-well superlattices. Dr. Fardi was the recipient of the Researcher of the Year Award from the College of Engineering of the University of Colorado at Denver in 1997.
Dr. Fardi is a senior member of the Institute of Electrical and Electronics Engineers (IEEE) and the Treasurer of the Electron Device Society, Denver Chapter.


Evaluation of Typical Prediction Structures for Multi-view Video Coding

Yun Zhang, Mei Yu, and Gangyi Jiang

Abstract—The main requirements for a multi-view video coding (MVC) prediction structure are that it provide high compression efficiency, operate with reasonable complexity and memory requirements, and facilitate random access. Up to now, many MVC prediction structures have been proposed, mainly with the aim of obtaining high coding gain; however, more and more importance is being attached to the functionalities of MVC schemes, such as random accessibility, coding and decoding complexity, and scalability. It is therefore desirable to make a comprehensive evaluation of the schemes, which contributes to a further understanding of MVC prediction techniques and the corresponding MVC schemes. In this paper, we analyze nine typical MVC prediction structures in six categories at length, and then experimentally compare their performances, including complexity, random accessibility, and scalability as well as compression efficiency. Finally, we discuss the trade-offs between the compression-related requirements for further research on MVC. Generally, MVC with hierarchical B pictures is the most efficient scheme in compression, but it results in high complexity and low random accessibility. The simulcast prediction structure is most suitable for applications without strong storage limitations but with low-delay or real-time requirements. Other schemes trade off compression efficiency against functionality by adopting multiple intra frames, sequential prediction, multiple-reference prediction, an intra-frame-centered structure, etc.

Index Terms—Multi-view video coding, prediction structure, random access, coding and decoding complexity, scalability.

I. INTRODUCTION
Multi-view video capturing, analysis, coding, and display have attracted a lot of attention in recent years, since three-dimensional television is expected to be the next killer application of digital media technologies in the coming decade. However, there are still quite a lot of technological difficulties to be overcome. Due to the massive amount of data in multi-view video, its transmission and processing require much more bandwidth and computational power than mono-video. Thus, multi-view video coding (MVC) [1]-[3] has become one of the key techniques for multi-view video applications.

Manuscript received on August 1, 2007. This work was supported by the Natural Science Foundation of China (grants 60472100 and 60672073), the Program for New Century Excellent Talents in University (NCET-06-0537), the Key Project of the Chinese Ministry of Education (grant 206059), and the Natural Science Foundation of Ningbo, China (grant 2007A610037).
Yun Zhang was with the Faculty of Information Science and Engineering, Ningbo University, Ningbo, China; he is now with the Institute of Computing Technology, Chinese Academy of Sciences, and the Graduate School of the Chinese Academy of Sciences, Beijing, 100080, China (e-mail: zhangyun_8851@163.com).
Mei Yu is with the Faculty of Information Science and Engineering, Ningbo University, Ningbo, 315211, China (e-mail: yumei2@yahoo.com.cn). She is the corresponding author of the manuscript.
Gangyi Jiang is with the Faculty of Information Science and Engineering, Ningbo University, Ningbo, 315211, China (e-mail: jianggangyi@126.com).
Many MVC schemes have been proposed, including MVC with simulcast [4], the sequential view prediction structure (SVPS) [1], the group-of-GOP coding structure (GoGOP) [5][6], MVC with multi-directional pictures (M-pictures) [7][8], and MVC with hierarchical B pictures (HBP) [9][10]. SVPS adopts view-by-view prediction in order to achieve relatively high coding efficiency by relieving occlusion and exposure. The MVC with M-picture prediction structure introduces a new picture type, the M-picture, which supports 21 coding modes, for high coding gain. The MVC with HBP prediction structure introduces hierarchical B pictures, which significantly improve compression efficiency. This research on MVC has mainly focused on rate-distortion (RD) performance, because high compression efficiency is demanded by the strong transmission and storage limitations of available networks and storage devices.

On the other hand, more and more importance is nowadays attached to the functionalities of MVC schemes, such as random access (RA), coding or decoding complexity, and scalability. Interactive multimedia applications, such as free-viewpoint video communication, will let the user freely change viewing position and direction while downloading and streaming video content. Therefore, fast RA in the view and time dimensions is a key performance aspect of MVC [11]-[13]. The GoGOP coding structure was mainly proposed to improve RA by adopting multiple intra frames in a 2D GOP. In addition, MVC schemes are also required to provide further compression-related functionalities, such as low-delay encoding and decoding, spatial/temporal/SNR/view scalability, and low memory and complexity requirements [2][3]. These requirements should be jointly taken into account when evaluating MVC schemes.

The rest of the paper is organized as follows. In Sections II and III, structure-related requirements and nine typical MVC coding schemes in six categories are discussed, respectively. Then, comparisons among the performances of the typical

MVC prediction structures are presented and analyzed in Section IV. Finally, a conclusion is given.
II. REQUIREMENTS FOR MVC
In addition to high compression efficiency, an excellent prediction structure for MVC should operate with reasonable complexity and memory requirements, provide fast RA, support scalability, and so on.

A. Random Accessibility
MVC schemes should support RA in the view and time dimensions as well as RA to a spatial area within a picture [3]. RA directly affects the system capabilities that support view switching, view sweeping, frozen moment [14], etc. Two cost parameters, the average and maximum path lengths of random viewpoint access, can be used to describe the random accessibility of an MVC scheme [10][15]. The smaller these costs are, the better the RA a scheme supports. Fig. 1(a) shows the simulcast prediction structure, in which each view is coded independently. A group of multi-view video frames can be thought of as a two-dimensional (2-D) coding cell or matrix, in which each element is a picture. The horizontal axis represents the views and the vertical axis represents time. Each rectangle represents a frame, and the arrows point to the compulsory reference pictures.
Fig. 1. (a) Simulcast. (b) IPPP.

Let x_i be the number of frames that must be pre-decoded before the i-th frame can be decoded in a coding cell with M_view views and N_time time-steps, and let p_i be the probability of the i-th frame being accessed by the viewer. The average path length of random viewpoint access [14], F_av, is defined by

    F_av = E(X) = Σ_{i=1}^{N_time × M_view} x_i p_i    (1)

and the maximum path length of random viewpoint access is defined as

    F_max = max { x_i | 0 < i ≤ M_view × N_time }    (2)

B. Computational Complexity
In an H.264/AVC-based MVC coding platform, a good portion of the gain comes from high-precision disparity-compensated prediction (DCP) and motion-compensated prediction (MCP), however at the cost of high coding complexity and memory consumption. MCP or DCP occupies most of the coding time, so in this paper we estimate the computational complexity of an MVC prediction structure by the minimum number of reference frames of a coding cell, PN_min. (Note that the coding structures could expand the number of referenced frames for higher compression efficiency.) The larger PN_min is, the more complicated the encoder is. For example, the simulcast scheme, as illustrated in Fig. 1(a), adopts P frames that reference at least one frame in the encoding procedure; therefore, its PN_min is 30.

C. Memory Requirement
General memory requirements stem from AVC, since all frames from all views are reordered and treated as one single video stream by the coder, and the decoded picture buffer (DPB) in H.264/AVC is used to store the reference frames. Assume that each scheme adopts the optimal coding order to minimize the DPB size, represented by DPB_min here. For the simulcast prediction structure, since P frames are single-referenced and are coded view by view, only one frame has to be stored in the DPB for the next frame, i.e., DPB_min equals 1.

D. View Scalability
View scalability enables the video to be displayed on a multitude of different displays, such as a multi-view display, a stereo display, or an HDTV, and on terminals over networks with varying conditions [3]. Therefore, a decoder with view scalability should be able to selectively decode the appropriate number of views according to the type of display mode. In this paper, we define two cost variables, F_SV and F_DV, to represent the average number of compulsorily decoded frames in a coding cell when a single view or two views are displayed, respectively.

Let O_n be the set of frames in a coding cell and X_i,j be the set of compulsorily decoded frames when the frame at position (i,j) in the coding cell is displayed; thus X_i,j ⊆ O_n. Suppose ρ_j is the probability of the j-th view being chosen for display by the user, and ρ_j,k is the probability of both the j-th and k-th views being accessed. F_SV and F_DV are defined as

    F_SV = Σ_{j=1}^{M_view} Card( ∪_{i=1}^{N_time} X_i,j ) ρ_j    (3)

    F_DV = Σ_{j=1}^{M_view} Σ_{k=j+1}^{M_view} Card( ∪_{i=1}^{N_time} (X_i,j ∪ X_i,k) ) ρ_j,k    (4)

where Card(·) is the cardinality of a set. On the other hand, the sender is only required to send the compulsorily decoded frames, so as to save transport-stream bandwidth as well as the decoder's computational power. The smaller the cost variables are, the better the view scalability the decoder supports.

Besides the requirements discussed above, there are other compression-related and system-related requirements that should be supported by a multi-view video system. More details are available in [3].

III. TYPICAL MVC PREDICTION STRUCTURES

A. Simulcast
As a reference for evaluating the compression efficiency of MVC schemes, the simulcast coding structure was analyzed on an H.264/AVC coding platform, in which each view of the multi-view video is coded independently [4]. The first frame of each view is coded as an intra frame (I-frame) and the remaining frames are coded as P-frames, as shown in Fig. 1(a). Simulcast coding can be achieved by independently coding the coding cell column by column. The simulcast coding structure is a direct expansion of the mono-video coding scheme.

On account of the deficiency of simulcast's RD performance, Sun et al. proposed an MVC scheme based on DCP, denoted IPPP, as shown in Fig. 1(b). The key frame of the center view in each 2D GOP is coded as an I-frame, while the other key frames are coded using inter-view DCP from the center to both sides. The structure enhances coding efficiency by eliminating the inter-view redundancy of the first time-step of a coding cell, while maintaining relatively fast RA.

Fig. 2. (a) SVPS using P frames. (b) SVPS using B frames.

Fig. 3. (a) GoGOP AB. (b) GoGOP SR. (c) GoGOP MR.

B. Sequential View Prediction Structure [1]
Fig. 2(a) illustrates a sequential prediction structure for MVC, where the video sequence from the first camera is encoded normally using temporal prediction. Then, each frame in the second viewpoint is predicted using the corresponding frame in the first viewpoint, as well as with temporal prediction from other frames in the second viewpoint. Next, the third viewpoint sequence is predicted using the second viewpoint sequence, and so on. There may be several variations of this sequential approach. For example, as illustrated in Fig. 2(b), bi-prediction may be used to increase coding efficiency. In this case, bi-predicted pictures in certain views are stored so that neighboring pictures in different views can use them as references for prediction.

In Fig. 2(b), there exist five types of frames: I (pure intra coded), P' (single-view prediction, using only I or other P' frames as references), P (single temporal prediction, using only I or other P frames as references), B' (bi-predicted temporally and spatially, using at most two references among P, P', or other B' frames), and B (also bi-predicted temporally and spatially, but using three references including a stored B from the preceding view).

Fig. 4. Lim's scheme with one I-frame (One-I).

C. Group-of-GOP Coding Structure
NTT Corp. proposed an MVC scheme with the GoGOP structure, which is capable of decoding a view with low delay [5][6]. The GoGOP is an extended structure of the GOP and provides low-delay RA in the view dimension as well as in the time dimension. In GoGOP, all GOPs are categorized into two types, Base-GOP and Inter-GOP. A picture in a Base-GOP may reference decoded pictures only in the current GOP, whereas a picture in an Inter-GOP may use decoded pictures in other GOPs as well as in the current GOP.

The method that codes all pictures as Base-GOPs is called GoGOP AB (All in Base-GOP), as illustrated in Fig. 3(a). It achieves the lowest delay, though it is deficient in compression efficiency. To improve coding efficiency, two more advanced schemes were proposed. One is the Single-Reference (SR) Inter-GOP coding method, denoted GoGOP SR. In GoGOP SR, a picture in an Inter-GOP refers to previous and current pictures in Base-GOPs and to previous pictures in the same GOP, as illustrated in Fig. 3(b). Extending the structure of GoGOP SR, the Multiple-Reference (MR) Inter-GOP coding method, denoted GoGOP MR, was proposed.

In GoGOP MR, a picture in an Inter-GOP refers to previous and current pictures in other GOPs (both Base-GOPs and Inter-GOPs) and to previous pictures in the current GOP, as illustrated in Fig. 3(c).
D. Lim's MVC Scheme
Lim et al. proposed a multi-view sequence codec with flexibility and view scalability [15]. The encoder generates two types of bitstreams, a main bitstream and an auxiliary one. The main bitstream is coded using temporal prediction. The auxiliary bitstream contains information concerning the remaining multi-view sequences except for the reference sequences. Fig. 4 illustrates one kind of Lim's prediction structure with one I-frame, named One-I. In this prediction structure, five kinds of frames are defined: the I-frame; Pt and Bt frames for removing temporal redundancy using MCP; Ps and Bs frames for removing view redundancy using DCP; and the Bs,t frame, predicted from the temporal and spatial dimensions simultaneously, for removing redundancy in both domains. The I-frame is centered in the coding cell for fast RA. In Fig. 4, the middle view of the matrix is encoded into the main bitstream, and the remaining views are coded into the auxiliary bitstream. The decoding configuration is selectively determined at the receiver according to the type of display device and display mode. The viewer can choose an arbitrary number of views, so that only the selected views are decoded and displayed. For higher compression efficiency, we implemented Lim's prediction structure on the H.264/AVC platform for experimental comparison.
E. MVC with Multi-directional Picture
Oka and Fujii proposed a new picture type for coding dynamic ray-space, named the multi-directional picture (M-picture), which utilizes inter-image prediction in both the temporal and spatial domains [8]. The M-picture has 21 coding modes, classified into five categories, for high compression efficiency. Fig. 5 illustrates the MVC with M-picture prediction structure in a coding cell with five views and seven temporal intervals. Rectangles labeled M denote the multi-directional pictures. Each M-picture multi-references at least four surrounding B or P pictures.

Fig. 5. MVC with M-picture.

The introduction of the M-picture improves compression efficiency because multi-directional prediction enables the coder to encode pictures with fewer residual errors and bits. Furthermore, a new cost function has been proposed for effective mode selection that maximizes the coding efficiency of the coder.

F. MVC with Hierarchical B Picture
HBP significantly improves RD performance when the quantization parameters (QP) for the various pictures are assigned appropriately [17]. Additionally, HBP provides hierarchical temporal scalability. Based on a statistical correlation analysis of multi-view video, MVC with HBP was proposed for the case of a 1D camera arrangement (linear or arc). Later, it was extended to support 2D camera arrangements (cross or 2D-array) [10]. Fig. 6 shows the prediction structure for the 1D camera arrangement. In the figure, rectangles labeled B_level are hierarchical B pictures with different levels, where level indicates the coding level of the B picture. Higher-level B pictures are allowed to reference lower-level pictures only.

The inter-view/temporal prediction structure in Fig. 6 applies HBP in both the temporal and inter-view dimensions. For an H.264/AVC-compatible encoder, the multi-view video sequences are combined into one single uncompressed video stream using a specific scan [9]. This is a pure encoder optimization; the only changes to the encoder are an increase of the decoded picture buffer size to 2·N_time + M_view, to store all necessary images, and a potentially larger number of output pictures per second than is currently allowed in H.264/MPEG4-AVC.

MVC with HBP significantly outperforms the other approaches in compression efficiency and provides multi-level temporal scalability. In addition, the multi-view video data are reorganized into a single uncompressed video stream that is fed into a standard H.264/AVC encoder, so the resulting bitstream is H.264/AVC compatible.



Fig. 6. MVC with HBP.


IV. EXPERIMENTAL RESULTS AND ANALYSES

A. Compression Efficiency Comparison
In order to evaluate the compression efficiency of the nine MVC schemes, the MVC prediction structures presented in Section III were realized using an H.264/AVC encoder with extended memory capabilities. Multi-view video sequences from KDDI, MERL, and Tanimoto Lab, with varying content, different spatial-temporal density, and different resolutions, such as flamenco1, race2, golf2, vassar, Aquarium, and xmas (with a 30 mm camera interval), were coded. Information on the sequences is given in Table I, and the encoding parameters are listed in Table II. It is important to point out that, in our experiments, I, P, B, and M pictures adopt a consistent QP within a coding process, and the coders use as few reference frames as possible for low complexity.

Fig. 7 illustrates the PSNR-Y over bit-rate, averaged over all views of each data set. Here, "HBP" denotes MVC with the HBP coding structure, while "BSVP" and "PSVP" denote SVPS using B and P pictures, respectively. "GoGOP SR" and "GoGOP MR" denote the two Inter-GOP GoGOP coding structures. "Jeong" represents the MVC scheme proposed by Lim et al., and "Mpicture" denotes the MVC with M-picture coding scheme proposed by Fujii et al. Additionally, the simulcast coding scheme and the IPPP coding structure proposed by Sun et al. are represented by "Simulcast" and "IPPP", respectively.

TABLE I
TEST SEQUENCES FOR MULTI-VIEW VIDEO

Data Set     Resolution   Camera distance   Video characteristics
flamenco1    320×240      20 cm             Slow movement, varying chroma
race2        320×240      20 cm             Cars move fast
golf2        320×240      20 cm             Cameras move horizontally, slowly
vassar       640×480      19.5 cm           Big static background
xmas         640×480      30 mm             Dense linear camera arrangement
aquarium     320×240      3 cm              Arc camera arrangement

TABLE II
CODING PARAMETERS

Platform                        JM8.5, main profile
RDO                             Yes
Entropy coding                  CABAC, loop filter
Search range                    32 for CIF/VGA
Motion/disparity compensation   1/4-pixel accuracy, full search
QP                              18, 24, 30, 36

Fig. 7. RD performance (average PSNR-Y in dB versus entropy in bit/pixel) for the five data sets: (a) xmas, (b) flamenco1, (c) Aquarium, (d) golf2, (e) race2.

In Fig. 7, it is obvious that HBP outperforms the other schemes significantly, by about 1 dB, at all bit-rates for the tested multi-view sequences (note that the curves connect the four sampled QP points). The BSVP, PSVP, Mpicture, and Jeong schemes take second place in coding efficiency and outperform simulcast by 1 dB to 4 dB through spatial and temporal prediction. The GoGOP and IPPP curves nearly overlap simulcast for all sparse sequences, but for the dense xmas sequence they outperform simulcast by 1 dB to 2 dB. Obviously, the compression efficiency depends not only on the parameters of the sequence, such as camera distance and frame rate, but also on the content of the sequence itself.

From Fig. 7(b), the Mpicture scheme is inferior to the IPPP scheme in RD performance because M-pictures are not efficient enough when the chroma of flamenco1 varies in the temporal domain. SVPS, Jeong, HBP, and Mpicture obtain significant compression efficiency from spatial prediction. HBP also obtains a good portion of its gain from the hierarchical B pictures in the temporal dimension.

For most multi-view sequences, the majority of the correlation lies in temporally preceding frames. For Aquarium, only a small gain, no more than 1 dB, could be achieved by spatial and inter-view prediction because the percentage of spatial correlation is low. However, there is also a significant amount of dependency that is better predicted from the spatial direction [18]. For the sequences flamenco1, golf2, and race2, this amount is rather high, which leads to the assumption that for these sequences a significant coding gain can be achieved by a proper multi-view prediction scheme. Multiple-reference prediction removes temporal and inter-view redundancies to improve compression efficiency. But for a sequence whose correlations are concentrated in the temporal domain, a multi-reference prediction scheme is unable to achieve more gain than simulcast coding. So, to achieve both high compression efficiency and low complexity, it is reasonable to predict from the spatial domain, the temporal domain, or both jointly, according to the spatial-temporal correlation.

TABLE III
PERFORMANCE COMPARISON

                     Random Access Cost                          View Scalability
Coding Structure     Fav     Fmax      PNmin     DPBmin     FSV      FDV
Simulcast            3.0     6         30        1          7        14
GoGOP SR             3.6     9         111       16         12.6     21.7
GoGOP MR             4.6     14        114       16         15.4     25.2
PSVP                 11.0    34        58        7          21       28
BSVP                 7.5     19        83        7          21       28
MVC with Mpicture    6.0     20        97        16         16       22.4
MVC with HBP*        7.6     16        96#       21         15.2     26.1
Lim's scheme         3.1     -         62        -          12.6     18.2
IPPP                 4.2     -         34        -          8.2      15.2

Note: * denotes the corresponding performance values in the 5×8 coding structure.
# denotes more complexity due to recursive MCP and DCP.

B. Random Accessibility Comparison
In the MVC with HBP scheme, the maximum number of reference frames, Fmax, is required for the B frames with the highest level Lmax and the highest view number Mview within a GOP. Thus, Fmax can be calculated as Fmax = 3Lmax + 2⌊(Mview − 1)/2⌋. Applying this to the coding structure in Fig. 6, a total of Fmax = 16 reference frames have to be decoded. In Table III, it is apparent that Fmax generally coincides with Fav for the compared schemes.

The simulcast scheme is the best in Fav, owing to single-reference prediction and multiple I-frames in a coding cell. However, it is slightly inferior to Lim's scheme in Fmax, because simulcast predicts sequentially in the temporal domain of each view. Lim's scheme is excellent in RA owing to its I-frame-centered structure, although it may cost more buffers. The IPPP and GoGOP schemes are in second place in random accessibility. Compared with simulcast, the IPPP structure enhances compression efficiency by eliminating the inter-view redundancy of the first time-step of a coding cell; it reduces the number of I-frames and extends the path length of sequential prediction. GoGOP maintains relatively fast RA thanks to multiple I-frames and its small Base-GOP size, but frames in Inter-GOPs are multi-referencing, especially frames in MR Inter-GOPs; they not only increase complexity but also extend the number of pre-decoded frames, which causes more delay. SVPS is the worst of all in RA: PSVP has to pre-decode 11 frames on average when one frame is randomly selected to be decoded, and in case the bottom-right frame is selected, the decoder has to pre-decode 34 frames. The introduction of B pictures in SVPS cuts down the path length of sequential prediction and improves random accessibility. MVC with HBP and MVC with M-picture are not good in RA due to multiple-reference prediction.


C. Computational Complexity Comparison


Column PNmin in TABLE III shows the computational
complexity of the MVC schemes. From the table, GoGOP is
the most time-consuming scheme of which PNmin is more
than 110, because most frames in Inter-GOPs reference from
6 to 8 frames. MVC with M-picture, MVC with HBP and
BSVP are in the second place in complexity. Most HBPs and
all M-pictures reference from four frames, and most B frames
in BSVP reference from three frames. Lims MVC scheme
and PSVP are in the third place. PSVP mainly adopts P
frames, most of which are two-frame referencing, either
temporally or spatially according to RD cost. In Lims MVC
scheme, most frames are also two-frame referencing.
Simulcast and IPPP mainly adopts single reference P frames
that contribute to their lowest complexity.
D. Memory Requirement Comparison
Column DPBmin in TABLE III shows the memory requirements of the compared MVC schemes. For MVC with HBP, in the case of a coding cell with five views and seven time-steps, DPBmin would be 21 whatever the encoding order is. GoGOP and MVC with M-picture are also memory-consuming, with DPBmin equal to 16. We attribute these three schemes' high memory requirements to their multi-reference prediction structures. For instance, all I, B and P frames have to be stored in the MVC with M-picture scheme while encoding M-pictures. SVPS and Lim's scheme are in the middle class in memory requirement because of sequential prediction and the I-frame centered structure, respectively. Simulcast and IPPP are the best, thanks to single-reference prediction. From the DPB size and structure analysis, we can conclude that single-reference prediction, sequential prediction and an I-frame centered structure contribute to low memory requirements.
E. View Scalability Comparison
Assume that view switching among the views is equiprobable, i.e., each of the five views is displayed with probability 0.2 and each pair of views with probability 0.1. Columns FSV and FDV in TABLE III show the number of decoded frames for displaying a single view and two views, respectively. As can be seen in the table, simulcast is the best one: there is no additional decoded frame in this scheme. IPPP needs one or two additional frames, because the frames in the first time-step are coded using DCP from the center to both sides, but it is still a good structure in view scalability. The view scalability cost of Lim's scheme and GoGOP SR is generally 1.5 times the number of displayed frames, which implies 1.5 times the decoder computational power and bandwidth. GoGOP MR, MVC with M-picture and MVC with HBP require nearly two times the computational power. SVPS is the worst of all: it expends three times the cost for single-view display and decodes twice the frames for double-view display. Generally, view-by-view sequential prediction and multiple reference prediction may burden the view scalability of a multi-view video system.
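To make FSV concrete, the sketch below computes it as an expectation under the equiprobable-switching assumption above. The per-view cost list shown is the one case fixed by the text (simulcast, where each view needs only its own 7 frames per coding cell); costs for the other schemes would follow their respective dependency structures.

```python
# FSV as an expectation: each of the five views is requested with p = 0.2,
# and cost[j] is the number of frames decoded to display view j.
def expected_cost(cost, p=0.2):
    return sum(p * c for c in cost)

simulcast = [7, 7, 7, 7, 7]      # each view is self-contained (7 time-steps)
print(expected_cost(simulcast))  # -> 7.0, matching FSV for simulcast in TABLE III
```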

From the results above, MVC with HBP provides outstanding compression efficiency as well as temporal scalability, which come from HBP and its distinctive prediction structure (one I-frame is shared by two coding cells). However, this coding structure requires a large memory (more than a GOP), which is disadvantageous especially for real-time applications. Though not superior in RA, resource consumption and view scalability, it is an MVC scheme with a high compression ratio and is well suited to storage-oriented applications.

Simulcast and IPPP schemes, contrary to MVC with HBP, provide fast RA, low complexity, low memory requirements and perfect view scalability, but their common fatal disadvantage is that they are not good enough in RD performance, though IPPP improves a bit compared with simulcast. As a result, simulcast and IPPP are recommended for applications without strong bandwidth limitation but with low-delay requirements, such as multi-view video systems in a LAN, or receivers without enough computational power and memory.

Lim's MVC scheme jointly adopts MCP and DCP to eliminate temporal and inter-view redundancies. Significant improvements in random accessibility and complexity are achieved by placing the I-frame at the center of the 2D GOP. However, the I-frame centered structure introduces some initial delay and requires a larger DPB on the receiver side, as does MVC with HBP. Lim's scheme is well balanced among compression efficiency, RA, complexity, etc.
Though M-pictures provide high compression efficiency in MVC with M-picture, the bottom-right and up-right frames in a coding cell are coded with low efficiency due to long-distance inter-view prediction. The scheme may work well for high-density camera multi-view video sequences, in which the correlations of the two domains are not so different. On the other hand, the coder is very time-consuming owing to multiple reference prediction and rate-distortion optimization, so it is difficult to apply in real-time applications. Additionally, MVC with M-picture is poor in RA and view scalability, which are essential to multi-view video applications such as free viewpoint video.

SVPS takes advantage of multiple reference prediction to eliminate inter-view and temporal redundancy. Furthermore, it relieves exposure and occlusion through view-by-view prediction. However, the sequential prediction may cause error propagation and poor performance in RA and view scalability. Fortunately, the introduction of B pictures improves RA a bit.

The GoGOP structure was once proposed for providing low-delay RA. Even if Inter-GOPs are not decoded, all views can be obtained by decoding Base-GOPs and interpolation. However, its RA is still not good enough, especially for GoGOP MR. GoGOP is inferior to simulcast and Lim's


scheme. From analysis over a larger set of multi-view video sequences, Fecker and Müller et al. have drawn the conclusion that temporal prediction is the most efficient prediction mode, because temporal correlation is generally much stronger than inter-view correlation, though there are differences between the data sets in scene complexity, frame rate, etc. In GoGOP, it is inefficient for Inter-GOPs to predict from frames at different views and different time-steps relative to the current frame. Multiple reference prediction in Inter-GOPs improves RD performance slightly, but at the expense of huge computational power and RA performance.


V. CONCLUSION
Nine typical MVC prediction structures in six categories were analyzed in detail, and comparisons of compression efficiency, RA, view scalability and complexity were made. From the experimental results, we conclude that MVC with HBP is the most efficient in compression, and simulcast is most suitable for applications without strong bandwidth limitations but with low-delay requirements. The other schemes trade off compression efficiency against functionality by adopting multiple intra frames, sequential prediction, multiple reference prediction, an I-frame centered structure, etc.

For most multi-view video, the majority of the correlation lies in temporally preceding frames. However, there is also a significant amount of dependency that is better predicted from the spatial direction. Multiple reference prediction may reduce the prediction residue and improve coding efficiency, but excessive multiple reference prediction not only increases complexity and memory requirements, but also works against view scalability and random accessibility. Multiple reference prediction should be used more effectively, according to the relationship between temporal and inter-view dependencies. Additionally, it is evident that view-by-view sequential prediction relieves exposure and occlusion problems but degrades RA performance and view scalability. Fortunately, using B pictures in a sequential prediction structure may improve RA a bit, and improvements in RA and scalability can be achieved by placing the I-frame in the center of the 2D GOP. RA can also be improved by adopting multiple I-frames in a 2D GOP, at the cost of compression efficiency.

In future work, we will conduct further research to propose smart prediction structures that predict from the spatial domain, the temporal domain or the joint spatial-temporal domain according to the spatial-temporal correlation, and that provide high coding efficiency, low complexity, reasonable memory requirements and fast RA.
ACKNOWLEDGMENT

Mitsubishi Electric Research Laboratories, KDDI R&D Laboratories, and Tanimoto Laboratory at Nagoya University have kindly provided multi-view video test sequences. HHI has kindly provided the joint model software.

REFERENCES

[1] ISO/IEC JTC1/SC29/WG11 N6909, "Survey of Algorithms used for MVC," Hong Kong, China, Jan. 2005.
[2] M. Tanimoto, T. Fujii, H. Kimata and S. Sakazawa, "Proposal on Requirements for FTV," 23rd JVT Meeting, Doc. JVT-W127, California, USA, Apr. 2007.
[3] ISO/IEC JTC1/SC29/WG11 N8218, "Requirements on Multi-view Video Coding v.7," Klagenfurt, Austria, Jul. 2006.
[4] U. Fecker and A. Kaup, "H.264/AVC-Compatible Coding of Dynamic Light Fields Using Transposed Picture Ordering," in Proc. 13th European Signal Processing Conference, Antalya, Turkey, 2005.
[5] H. Kimata, M. Kitahara, K. Kamikura, et al., "Low-delay multiview video coding for free-viewpoint video communication," Systems and Computers in Japan, vol. 38, no. 5, pp. 14-29, May 2007.
[6] H. Kimata, M. Kitahara, and K. Kamikura, "Multi-view video coding using reference picture selection for free-viewpoint video communication," in Proc. Picture Coding Symposium 2004, San Francisco, 2004, pp. 499-502.
[7] S. Oka, P. N. Bangchang, and T. Fujii, "Dynamic Ray-Space Coding for FTV" (in Japanese), in Proc. 3D Image Conference, Tokyo, Japan, Jun. 2004, pp. 139-142.
[8] S. Oka, T. Endo, and T. Fujii, "Dynamic Ray-Space Coding using Multi-directional Picture," IEICE Technical Report, vol. 104, no. 493, pp. 15-20, Dec. 2004.
[9] P. Merkle, A. Smolic, K. Müller and T. Wiegand, "Efficient Prediction Structures for Multi-view Video Coding," to be published in IEEE Transactions on Circuits and Systems for Video Technology, 2007.
[10] ISO/IEC JTC1/SC29/WG11 W8019, "Description of Core Experiments in MVC," Montreux, Switzerland, Apr. 2006.
[11] Y. Liu, Q. Huang, X. Ji, D. Zhao, and W. Gao, "Multi-view Video Coding with Flexible View-Temporal Prediction Structure for Fast Random Access," Lecture Notes in Computer Science, vol. 4261, pp. 564-571, 2006.
[12] Y. Liu, Q. Huang, D. Zhao, and W. Gao, "Low-delay View Random Access for Multi-view Video Coding," in Proc. ISCAS 2007, May 2007, pp. 997-1000.
[13] X. Tong and R. M. Gray, "Interactive rendering from compressed light fields," IEEE Transactions on Circuits and Systems for Video Technology, vol. 13, no. 11, pp. 1080-1091, Nov. 2003.
[14] J. G. Lou, H. Cai, and J. Li, "A Real-Time Interactive Multi-View Video System," in Proc. 13th ACM International Conference on Multimedia, Singapore, Nov. 2005.
[15] G. Jiang, M. Yu, Y. Zhou, and Q. Xu, "A New Multi-View Video Coding Scheme for 3DAV Systems," in Proc. Picture Coding Symposium 2006, Beijing, China, Apr. 2006.
[16] J. E. Lim, K. N. Ngan, and W. Yang, "A multiview sequence CODEC with view scalability," Signal Processing: Image Communication, vol. 19, pp. 239-256, 2004.
[17] H. Schwarz, D. Marpe, and T. Wiegand, "Hierarchical B pictures," presented at the 16th JVT Meeting, Doc. JVT-P014, Poznań, Poland, Jul. 2005.
[18] U. Fecker and A. Kaup, "Statistical Analysis of Multi-Reference Block Matching for Dynamic Light Field Coding," ISO/IEC JTC1/SC29/WG11, Doc. M11546, Hong Kong, China, Jan. 2005.

Yun Zhang received his B.S. and M.S. degrees in information and electronic engineering from the Faculty of Information Science and Engineering, Ningbo University, China, in 2004 and 2007. He is now pursuing a doctoral degree at the Institute of Computing Technology, Chinese Academy of Sciences, China. His research interests mainly include digital video compression and communications, SoC design and embedded systems for consumer electronics.
Mei Yu received her M.S. degree from Hangzhou Institute of Electronics Engineering, China, in 1993, and her Ph.D. degree from Ajou University, Korea, in 2000. She is now a professor at the Faculty of Information Science and


Engineering, Ningbo University, China. Her research interests include image and video coding, and visual perception.
Gangyi Jiang received his M.S. degree from Hangzhou University, China, in 1992, and his Ph.D. degree from Ajou University, Korea, in 2000. He is now a professor at the Faculty of Information Science and Engineering, Ningbo University, China. His research interests mainly include digital video compression and communications, multi-view video coding, image-based rendering, and image processing.

Regular Paper
Original Contribution


Microstrip Bias Networks for Ultra-Wideband Systems

Adriana Serban, Magnus Karlsson, and Shaofang Gong, Member, IEEE

Abstract: Bias networks with radio frequency (RF) chokes can be implemented using different microstrip elements, which offer different advantages in terms of bandwidth and occupied area. However, sharp discontinuities of the transfer functions have been observed in these types of bias networks. In this paper they are explained by resonances generated within the DC path of the bias network. As the resonance behavior degrades the performance of broadband RF circuits, the robustness of different bias networks against resonance was investigated. Different bias networks were fabricated and measured. Both simulation and experimental results show that broadband microstrip bias networks can be optimized to avoid or reduce the resonance phenomena.

Index Terms: Bias network, broadband amplifier, butterfly stub, low-noise amplifier, microstrip components, radial stub, RF choke.

I. INTRODUCTION

BIAS networks using microstrip transmission lines are widely used in radio frequency (RF) and microwave circuits because of their simple structures, easy processing, and good affinity with the rest of the active circuits, e.g., amplifiers and mixers [1]. Due to the rapid evolution of mobile communication systems towards ultra-wideband (UWB) systems, the performance of every circuit in a radio chain must be optimized over a large frequency bandwidth. The general problem of wideband low-noise amplifier (LNA) design has been extensively discussed in recent years, and different methods have been developed to simultaneously minimize the noise figure and optimize the power gain over a wide frequency band [2]-[5]. Nevertheless, the problem of bias-network design for typical UWB applications has not been explicitly addressed. For example, the LNA in a multi-band UWB transceiver in Band Group 1 defined by [6] must operate over the frequency interval 3.1-4.8 GHz. For such an amplifier the bias network must operate correctly at least over the same frequency bands. However, even if the bandwidth specification is achieved for the bias network itself, single or multiple sharp discontinuities (notches) of the transfer function have been observed when the complete UWB LNA module was

Manuscript received October 8, 2007. Ericsson AB in Sweden is acknowledged for financial support of this work.
Adriana Serban (e-mail: adrse@itn.liu.se), Magnus Karlsson (e-mail: magka@itn.liu.se), and Shaofang Gong are with Linköping University, Sweden.

simulated [3]. Furthermore, resonances have been reported in other RF circuits where band-limited RF chokes are present [7]-[8]. In the case of an LNA, these notches can critically degrade the performance, as both the noise figure and the flatness of the power gain are affected. Moreover, simulation results indicated the possibility of unstable operation of the amplifier.
At RF frequencies, using bias circuits implemented with distributed components in the form of microstrip transmission lines, the bandwidth limitation of bias circuits implemented with discrete components can be avoided [1]. Classical bias networks have the characteristic of bandpass filters, using the impedance-transforming properties of quarter-wave transmission lines to generate a virtual RF ground. They provide good isolation between the RF and DC ports and are usually referred to as RF chokes. Different types of microstrip stubs can be used to implement the RF choke, such as the straight microstrip stub, the radial stub and the butterfly stub. Analytical models and accurate algorithms for modeling and characterizing radial stubs have been developed over a long time [9]-[15]. Their properties and performance have been compared to those of straight stubs when used for impedance tuning, bias networks, and lowpass and bandpass filters. Butterfly stub models have also been developed and verified in different wideband filter applications [16]-[17]. Recent investigations [18]-[23], supported by the extraordinary development of electronic design automation (EDA) simulation tools [24], have confirmed their applicability in different types of modern high-frequency circuits. Radial stubs have been used to improve the spurious response of bandpass filters [18], while butterfly stubs have been innovatively used as Photonic Bandgap (PBG) cells [19] and in dual-mode ring bandpass filters [20]-[21] with increased bandwidth and good rejection properties. For further increased bandwidth and a more compact area, novel RF components based on the butterfly stub were discussed in [22]-[23].
In this paper, bias networks implemented with different distributed components are studied with focus on bandwidth and their robustness to secondary effects. Simulation results and experimental measurements are compared with each other.
II. MICROSTRIP BIAS NETWORKS WITH RF CHOKE
Classically, an RF amplifier can be described as a three-


stage circuit, see Fig. 1. It consists of an active device,


symbolized by the transistor, the input matching network
(IMN) and the output matching network (OMN).

Fig. 1. Simplified schematic of an amplifier including the bias network (dashed line) with RF choke.

In Fig. 1, Cin and Cout are DC-block capacitors and C is the decoupling capacitor for suppressing the power supply noise. Design techniques for broadband RF amplifiers focus on the amplifier topology or on special techniques providing wideband input and output matching networks [3]-[5]. However, the DC bias network, indicated by the dashed line in Fig. 1, must also operate correctly over at least the same frequency band. The main component in a bias network is the RF choke. As RF chokes using discrete inductors may not have sufficient bandwidth, microstrip RF chokes are used instead [1]. As seen in Fig. 1, the microstrip RF choke is composed of two quarter-wavelength (λ/4) transformers: a λ/4 open-circuited stub and a λ/4 connecting line, where λ is the wavelength.
The open-circuited stub can be implemented using different microstrip elements, as shown in Figs. 2a-2c. It is known that the radial stub can provide a broader bandwidth than a straight microstrip stub and can be used successfully in some broadband bias networks [9]-[15]. In this paper, a third implementation of the RF choke is considered and analyzed, i.e., the bias network with an RF choke using a butterfly stub, shown in Fig. 2c.

Fig. 2. Bias network with RF choke using (a) microstrip stub, (b) radial stub,
and (c) butterfly stub.

TABLE I
PCB PROCESS PARAMETERS
Material: RO4350B
Dielectric thickness: 0.254 mm
Dielectric constant: 3.48 ± 0.05
Dissipation factor: 0.0037
Metal thickness: 0.035 mm
Metal conductivity: 5.8 × 10^7 S/m
Surface roughness: 0.001 mm

The connection of the bias network to an active circuit, e.g., the RF amplifier shown in Fig. 1, should not disturb the RF signal. A low-loss RF route from Port 1 to Port 2 shown in Fig. 2 is achieved by using the RF choke which, in theory, provides infinite impedance towards Port 3, i.e., the DC bias port. At the center frequency, ideally, the open-circuited stub provides a low impedance level, i.e., an RF short, at point B. The λ/4 transmission line then transforms the RF short at point B to an RF open, i.e., infinite impedance, at junction point A. For wideband applications it is important that these requirements are met not only at the center frequency but also over the entire bandwidth.
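These two λ/4 transformations can be sanity-checked with ideal lossless transmission-line formulas. The sketch below is an idealization, not the paper's EM simulation: it ignores microstrip dispersion, losses and junction effects, and the Z0 = 50 Ω, f0 = 4 GHz values are taken from this paper's design.

```python
from math import pi, tan

Z0 = 50.0        # characteristic impedance (ohms), per the paper's 50-ohm system
F0 = 4e9         # frequency where both lines are a quarter wavelength

def electrical_length(f_hz: float) -> float:
    """beta*l of a line that is exactly lambda/4 long at F0."""
    return (pi / 2.0) * (f_hz / F0)

def open_stub_impedance(f_hz: float) -> complex:
    """Input impedance of a lossless open-circuited stub: -j*Z0/tan(beta*l)."""
    return -1j * Z0 / tan(electrical_length(f_hz))

def line_transform(f_hz: float, z_load: complex) -> complex:
    """Impedance seen through a lossless line (lambda/4 at F0) ended in z_load."""
    t = tan(electrical_length(f_hz))
    return Z0 * (z_load + 1j * Z0 * t) / (Z0 + 1j * z_load * t)

zB = open_stub_impedance(F0)   # ~0 ohms: the RF short at point B
zA = line_transform(F0, zB)    # extremely large: the RF open at junction A
print(abs(zB), abs(zA))
```

Away from f0, tan(βl) moves off its extremes, which is exactly why the requirements are harder to meet over the whole band.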
III. SIMULATION AND MEASUREMENT SET-UPS
The three bias networks presented in Fig. 2 were designed and simulated using Advanced Design System (ADS) 2005A from Agilent Technologies Inc. Simulations were done at both schematic and layout levels. Layout-level simulations were done with the electromagnetic (EM) simulator Momentum in ADS. The bias networks were manufactured using a two-layer printed circuit board (PCB) with the Rogers material RO4350B. The PCB parameters are listed in Table I.

Fig. 3. Photograph of the three manufactured bias networks shown in Fig. 2, and a centimeter-scaled ruler.

The 50 Ω transmission line has the width w = 0.542 mm. The length of the λ/4 transmission line at 4 GHz is l = 11.5 mm. Referring to Figs. 2b and 2c, the radial stub has the optimized radius rO,RS = 6.8 mm, while the butterfly stub has the radius rO,BS = 8.3 mm. The angle for both radial and butterfly stubs is 60°. A photograph of the manufactured bias networks is shown in Fig. 3.
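The stated λ/4 length is consistent with the standard estimate l = c/(4·f·√εeff). In the sketch below, εeff ≈ 2.66 is an assumed effective permittivity, back-computed from the stated 11.5 mm at 4 GHz and plausible for a 50 Ω line on 0.254 mm RO4350B with εr = 3.48:

```python
from math import sqrt

C0 = 3.0e8   # speed of light in vacuum (m/s)

def quarter_wave_mm(f_hz: float, eps_eff: float) -> float:
    """lambda_g/4 = c / (4 * f * sqrt(eps_eff)), returned in millimetres."""
    return C0 / (4.0 * f_hz * sqrt(eps_eff)) * 1e3

# eps_eff ~ 2.66 assumed (back-computed from the stated l = 11.5 mm at 4 GHz)
print(round(quarter_wave_mm(4e9, 2.66), 1))   # -> 11.5
```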
Measurements were done with a Rohde & Schwarz ZVM


vector network analyzer. All bias networks were designed for the 3.1-4.8 GHz frequency band, i.e., Band Group 1 in a multiband UWB system [6].
As the DC supply line is usually shunt-connected with several decoupling capacitors to reduce noise from the DC source, the load impedance at Port 3 shown in Fig. 2 is closer to an RF short than to 50 Ω. In simulations and measurements, Port 3 is terminated with an RF short to emulate the decoupling capacitors.

By comparing the EM-simulated transmission coefficients to the measured ones, it can be seen that good agreement has been obtained for all three bias networks, except for some deviation above 5.5 GHz. As expected, the bias network using the radial stub has a wider bandwidth than the one using the straight-line stub. It is also important to notice that the bias network using the butterfly stub not only has the widest bandwidth but is also the most robust against sharp discontinuities of the transmission function. Fewer notches, with smaller amplitude, can be identified in Fig. 4c as compared to Figs. 4a and 4b.

IV. SIMULATIONS AND MEASUREMENT RESULTS


Figs. 4a-4c show the transmission coefficient, |S21|, of the bias networks implemented with a microstrip stub, a radial stub, and a butterfly stub, respectively. The solid line curves represent the measured data, while the dashed line curves represent the simulated data.

V. DISCUSSION
The discontinuities shown in Fig. 4 can be explained by resonances in the DC path of the bias network. To verify the resonance phenomena along the DC path, the amplitude of the input impedance at point A (see Fig. 1) when looking into the RF choke, |ZinA|, is analyzed. Based on the simulated input reflection coefficient S11, |ZinA| was calculated for the microstrip stub bias network.
In Fig. 5, comparing |ZinA| to the transmission coefficient, it can be seen that the discontinuities of |S21| appear at the same frequencies where |ZinA| ≈ 0. Thus, it can be concluded that the discontinuities of the transmission coefficient are caused by an equivalent series resonance in the DC path. The DC path from junction point A to Port 3 (see Fig. 2) results in a short-circuited λ/2 transmission line resonator.

(a) Microstrip stub, layout length lA = 28 mm and λ/4 = 11.5 mm, see Fig. 2a.
(b) Radial stub, layout length lA = 28 mm and rO,RS = 6.8 mm, see Fig. 2b.
(c) Butterfly stub, layout length lA = 28 mm and rO,BS = 8.3 mm, see Fig. 2c.
Fig. 4. Simulated and measured transmission coefficients of the bias networks with (a) microstrip stub, (b) radial stub, and (c) butterfly stub.

Fig. 5. Microstrip stub bias network: |S21| and the input impedance at point A (see Fig. 2), |ZinA|.

The origin of such resonance phenomena can be explained by (a) leakage of the RF signal into the DC route and (b) the near-zero termination of the DC port, i.e., Port 3 in Fig. 2.
Since the length lA from point B to Port 3 (see Fig. 2) is a layout-dependent and arbitrary length, it is interesting to analyze its influence on the transmission coefficient. Simulation results presented in Fig. 6 show that, if lA is increased to lA = 100 mm, more resonances within the frequency band degrade the performance of the three bias networks, as compared to those shown in Fig. 4 where lA = 28 mm. The worst case is the microstrip stub bias network, while the butterfly stub bias network is not only more broadband but also less affected by the resonance phenomenon.
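This trend can be reproduced with a first-order estimate that treats the DC path (the λ/4 line plus lA, shorted at Port 3) as a uniform transmission-line resonator with resonances near n·c/(2L√εeff). The sketch below is only indicative: it ignores stub loading, junction effects and dispersion, and reuses the assumed εeff ≈ 2.66 from above.

```python
from math import sqrt

C0, EPS_EFF = 3.0e8, 2.66   # eps_eff assumed, as before

def resonance_freqs(l_a_mm: float, l_quarter_mm: float = 11.5,
                    f_stop_hz: float = 8e9):
    """Resonances n*c/(2*L*sqrt(eps_eff)) of the shorted DC path of total
    length L = lA + (lambda/4 connecting line), up to f_stop_hz."""
    L = (l_a_mm + l_quarter_mm) * 1e-3
    f1 = C0 / (2.0 * L * sqrt(EPS_EFF))
    freqs, n = [], 1
    while n * f1 <= f_stop_hz:
        freqs.append(n * f1)
        n += 1
    return freqs

for l_a in (28.0, 100.0):
    in_band = [round(f / 1e9, 2) for f in resonance_freqs(l_a)
               if 3.1e9 <= f <= 4.8e9]
    print(f"lA = {l_a} mm -> resonances in 3.1-4.8 GHz: {in_band}")
```

The longer path places more resonance frequencies inside the operating band, in line with the degradation reported above for lA = 100 mm.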


Fig. 6. Simulated transmission coefficient when lA = 100 mm.

Fig. 7 shows simulation results when the termination resistance at Port 3 is 10, 50 and 90 Ω, respectively. It is important to notice that the RF choke using the butterfly stub is the one least influenced by different termination conditions.
Fig. 7. Simulated transmission coefficient when the DC port termination resistance is 10, 50 and 90 Ω, respectively.
Depending on the application, a circuit layout can be more complex than the basic configurations shown in Figs. 2a-2c. Stand-alone characterization of components like RF chokes is not enough; their surroundings, in particular neighboring components in the system, affect their performance. When other necessary components, e.g., decoupling capacitors, stabilization resistors and matching networks, are included, the transmission coefficient of the bias network can be disturbed by resonance phenomena whenever a non-optimized RF choke is used. As a consequence, key parameters of an active circuit, e.g., power gain and noise figure, can be seriously affected. Moreover, as the bandwidth of today's RF circuits increases, resonance phenomena can appear at different frequencies in the band. In order to identify their presence within the defined operation frequency band, it is important to perform layout-level simulations of the complete RF circuit, including the bias network, all passive components, via holes, pads and terminations. Using EM simulation tools, a broadband bias network using band-limited RF chokes can be optimized to avoid the appearance of resonance phenomena.

Choosing the appropriate microstrip component, e.g., a butterfly stub instead of a radial stub, can also minimize the effect of such resonances on the performance of broadband RF circuits.

VI. CONCLUSION

Undesired, single or multiple, sharp discontinuities of the broadband bias network transfer function have been observed. They can be explained by equivalent series resonances generated within the DC path of the bias network. The origin of such resonance phenomena can be explained by (a) leakage of the RF signal into the DC route and (b) the near-zero termination of the DC port. In our experiments the DC path, from junction point A to Port 3 shown in Fig. 2, results in a short-circuited λ/2 transmission line resonator.
The microstrip structure of the RF choke, using a straight-line, radial or butterfly stub, determines the blocking frequency range and therefore the robustness of the RF choke. The butterfly stub is best suited for broadband RF-choke applications. The RF choke using the butterfly stub gives not only the broadest band characteristic but also the most robust bias network towards (a) different layout geometries connecting the RF choke to the DC port, and (b) load impedance variation of the DC port.

REFERENCES


[1] S. Hong and M. J. Lancaster, Microstrip Filters for RF/Microwave Applications, John Wiley & Sons, Inc., 2001, ch. 6, pp. 188-190.
[2] S.-L. S. Yang, Q. Xue, and K.-M. Luk, "A Wideband Low-Noise Amplifiers Design Incorporating the CMRC Circuitry," IEEE Microwave and Wireless Components Letters, vol. 15, pp. 315-317, May 2005.
[3] A. Serban and S. Gong, "Ultra-wideband Low-Noise Amplifier Design for 3.1-4.8 GHz," in Proc. GigaHertz 2005 Conf., Uppsala, Sweden, 8-9 Nov. 2005, pp. 291-294.
[4] R. Ludwig and P. Bretchko, RF Circuit Design: Theory and Applications, Prentice Hall, 2000.
[5] G. Gonzalez, Microwave Transistor Amplifiers: Analysis and Design, Prentice Hall, 1997.
[6] "First report and order, revision of part 15 of the commission's rules regarding ultra-wideband transmission systems," FCC, Washington, 2002.
[7] L. Jiang and S. C. Shi, "Investigation on the Resonance Observed in the Embedding-Impedance Response of an 850-GHz Waveguide HEB Mixer," IEEE Microwave and Wireless Components Letters, vol. 15, no. 4, pp. 196-198, Apr. 2005.
[8] W. Yu and R. Mittra, "Accurate modeling of planar microwave circuit using FDTD algorithm," Electronics Letters, vol. 36, no. 7, pp. 618-619, Mar. 2000.
[9] B. A. Syrett, "A Broad-Band Element for Microstrip Bias or Tuning Circuits," IEEE Trans. Microwave Theory Tech., vol. MTT-28, no. 8, pp. 925-927, Aug. 1980.
[10] H. A. Atwater, "Microstrip Reactive Circuit Elements," IEEE Trans. Microwave Theory Tech., vol. MTT-31, pp. 488-491, Jun. 1983.
[11] F. Giannini, R. Sorrentino, and J. Vrba, "Planar Circuit Analysis of Microstrip Radial Stub," IEEE Trans. Microwave Theory Tech., vol. 32, pp. 1652-1655, Dec. 1984.
[12] F. Giannini, M. Ruggieri, and J. Vrba, "Shunt-Connected Microstrip Radial Stub," IEEE Trans. Microwave Theory Tech., vol. 34, no. 3, pp. 363-366, Mar. 1986.
[13] S. L. March, "Analyzing Lossy Radial-Line Stubs," IEEE Trans. Microwave Theory Tech., vol. 33, pp. 269-271, Mar. 1985.
[14] R. Sorrentino and L. Roselli, "A New Simple and Accurate Formula for Microstrip Radial Stub," IEEE Microwave and Guided Wave Letters, vol. 2, no. 12, pp. 480-482, Dec. 1992.


[15] I. Sakagami, Y. Hao, M. Mohemaiti, and A. Tokunou, "On a transmission-line Butterworth lowpass filter using radial stubs," in Proc. IEEE International Symposium on Circuits and Systems (ISCAS 2002), vol. 3, 2002, pp. 867-870.
[16] F. Giannini, C. Paoloni, and M. Ruggieri, "CAD-Oriented Lossy Models for Radial Stubs," IEEE Trans. Microwave Theory Tech., vol. 36, pp. 305-313, Feb. 1988.
[17] F. Giannini, M. Salerno, and R. Sorrentino, "Two-Octave Stopband Microstrip Low-Pass Filter Design using Butterfly Stubs," in Proc. 16th European Microwave Conference, 1986, pp. 292-297.
[18] J. Zhu and Z. Feng, "Enhancement of Stopband Rejection of Microstrip Bandpass Filters by Radial Stubs," in Proc. International Conference on Microwave and Millimeter Wave Technology (ICMMT '07), Apr. 2007, pp. 1-3.
[19] B. T. Tan, J. J. Yu, S. J. Koh, and S. T. Chew, "Investigation into Broadband PBG Using a Butterfly-Radial Slot (BRS)," in IEEE MTT-S International Microwave Symposium Digest, vol. 2, 8-13 Jun. 2003, pp. 1107-1110.
[20] B. T. Tan, J. J. Yu, S. T. Chew, M.-S. Leong, and B.-L. Ooi, "A Miniaturized Dual-Mode Ring Bandpass Filter with a New Perturbation," IEEE Trans. Microwave Theory Tech., vol. 53, pp. 343-348, Jan. 2005.
[21] R.-J. Mao, X.-H. Tang, and F. Xiao, "Miniaturized Dual-Mode Ring Bandpass Filters With Patterned Ground Plane," IEEE Trans. Microwave Theory Tech., vol. 55, pp. 1539-1547, Jul. 2007.
[22] R. K. Joshi and A. R. Harish, "Characteristics of a Rotated Butterfly Radial Stub," in IEEE MTT-S International Microwave Symposium Digest, Jun. 2006, pp. 1165-1168.
[23] R. Dehbashi, K. Forooraghi, Z. Atlasbaf, and N. Amiri, "A Novel Broad-Band Band-Stop Resonator with Compact Size," in Proc. Asia-Pacific Conference on Applied Electromagnetics (APACE 2005), Dec. 2005.
[24] Agilent Technologies Inc., Advanced Design System (ADS), http://eesof.tm.agilent.com/.

Adriana Serban received the M.Sc. degree in electronic engineering from Politehnica University, Bucharest, Romania. From 1981 to 1990 she was with the Microelectronica Institute, Bucharest, as a Principal Engineer, where she was involved in mixed integrated circuit design. From 1992 to 2002 she was with Siemens AG, Munich, Germany, and with Sicon AB, Linköping, Sweden, as a Senior Design Engineer for analog and mixed-signal integrated circuits. Since 2002 she has been a Lecturer at Linköping University, teaching analog/digital system design and RF circuit design. She is working towards her Ph.D. degree in Communication Electronics. Her main research interests are RF circuit design and high-speed integrated circuit design.
Magnus Karlsson was born in Västervik, Sweden, in 1977. He received his M.Sc. and Licentiate of Engineering degrees from Linköping University, Sweden, in 2002 and 2005, respectively. In 2003 he started his Ph.D. studies in the Communication Electronics research group at Linköping University. His main work involves wideband antenna techniques and wireless communications.
Shaofang Gong was born in Shanghai, China, in 1960. He received his B.Sc. degree from Fudan University, Shanghai, in 1982, and the Licentiate of Engineering and Ph.D. degrees from Linköping University, Sweden, in 1988 and 1990, respectively. Between 1991 and 1999 he was a senior researcher at the microelectronics institute Acreo in Sweden. From 2000 to 2001 he was the CTO at a spin-off company from the institute. Since 2002 he has been full professor in communication electronics at Linköping University, Sweden. His main research interests are communication electronics, including RF design, wireless communications and high-speed data transmission.

Regular Paper
Original Contribution


Multi-algorithmic Cryptography using Deterministic Chaos with Applications to Mobile Communications

Jonathan M. Blackledge, Fellow, IET, Fellow, BCS, Fellow, IMA, Fellow, RSS

Abstract: In this extended paper, we present an overview of the principal issues associated with cryptography, providing historically significant examples for illustrative purposes as part of a short tutorial for readers that are not familiar with the subject matter. This is used to introduce the role that nonlinear dynamics and chaos play in the design of encryption engines which utilise different types of Iteration Function Systems (IFS). The design of such encryption engines requires that they conform to the principles associated with diffusion and confusion for generating ciphers that are of a maximum entropy type. For this reason, the role of confusion and diffusion in cryptography is discussed, giving a design guide for the construction of ciphers that are based on the use of an IFS. We then present the background and operating framework associated with a new product, Crypstic™, which is based on the use of multi-algorithmic IFS to design an encryption engine that is mounted on a USB memory stick and uses both disinformation and obfuscation to hide a forensically inert application. The protocols and procedures associated with the use of this product are also discussed.

Index Terms: Cryptography, Nonlinear Dynamics, Iteration Function Systems, Chaos, Multi-algorithmicity

I. INTRODUCTION

THE quest for inventing innovative techniques which only allow authorized users to transfer information that is impervious to attack by others has been, and continues to be, an essential requirement in the communications industry (e.g. [1], [2], [3]). This requirement is based on the importance of keeping certain information secure, obvious examples being military communications and financial transactions, the former being a common theme in the history and development of Cryptology [4], [5], [6].
Cryptography is the study of mathematical and computational techniques related to aspects of information security (e.g. [7]-[9]). The word is derived from the Greek Kryptos, meaning hidden, and is related to disciplines such as Cryptanalysis and Cryptology. Cryptanalysis is the art of breaking cryptosystems by developing techniques for the retrieval of information from encrypted data without having a priori knowledge of the required decryption process (typically, knowledge of the key) [10]. Cryptology is the science that underpins Cryptography and Cryptanalysis and can include a broad range of mathematical concepts, computational algorithms and technologies. In other words, Cryptology is a multi-disciplinary subject that covers a wide spectrum of different disciplines and increasingly involves using a range of engineering concepts and technologies through the innovation associated with the term technology transfer. These include areas such as Synergetics, which is an interdisciplinary science explaining the formation and self-organization of patterns and structures in non-equilibrium open systems, and Semiotics, which is the study of both individual and grouped signs and symbols, including the study of how meaning is constructed and understood [11].

Manuscript received November 1, 2007. The work reported in this paper has been supported by the Council for Science and Technology, Management and Personnel Services Limited and Crypstic Limited.
Jonathan Blackledge is Professor of Information and Communications Technology, Applied Signal Processing Research Group, Department of Electronic and Electrical Engineering, Loughborough University, England, and Professor of Computer Science, Department of Computer Science, University of the Western Cape, Cape Town, Republic of South Africa (e-mail: jon.blackledge@btconnect.com).
Cryptology is often concerned with the application of formal mathematical techniques to design a cryptosystem and to estimate its theoretical security. This can include the use of formal methods for the design of security software, which should ideally be safety critical [12]. However, although the mathematically defined and provable strength of a cryptographic algorithm or cryptosystem is necessary, it is not a sufficient requirement for a system to be acceptably secure. This is because it is difficult to estimate the security of a cryptosystem in any formal sense when it is implemented under operational conditions that cannot always be predicted and, thus, simulated. The security associated with a cryptosystem can be checked only by means of proving its resistance to various kinds of known attack that are likely to be implemented. However, in practice, this does not mean that the system is secure, since other attacks may exist that are not included in simulated or test conditions. The reason for this is that humans possess a broad range of abilities, from unbelievable ineptitude to astonishing brilliance, which cannot be formalised in a mathematical sense or on a case by case basis.
The practical realities associated with Cryptology are indicative of the fact that security is a process, not a product [13]. Whatever the sophistication of the security product (e.g. the encryption and/or key exchange algorithm(s), for example), unless the user adheres strictly to the procedures and protocols designed for its use, the product can be severely compromised. A good example of this is the use of the Enigma [14] cipher by Germany during the Second World War. It was not just the intelligence of the code breakers at Bletchley Park in England that allowed the allies to break many of the Enigma codes, but the irresponsibility and, in many cases, the sheer stupidity of the way in which the system was used by the German armed and intelligence services at the time.
The basic mechanism for the Enigma cipher, which had been developed as early as 1923 by Arthur Scherbius for


securing financial transactions, was well known to the allies, due primarily to the efforts of the Polish Cipher Office at Poznań in the 1930s. The distribution of some 10,000 similar machines to the German army, navy and air force was therefore a problem waiting to happen. The solution would have been to design a brand new encryption engine, or better still, a range of different encryption engines given the technology of the time, and to use the Enigma machine to propagate disinformation. Indeed, some of the new encryption engines introduced by the Germans towards the end of the Second World War were not broken by the allies.
These historically intriguing insights are easy to contemplate in hindsight, but they can also help to focus on the methodologies associated with developing new technologies for knowledge management, which is a focus of the material considered in this work. Here, we explore the use of deterministic chaos for designing ciphers that are composed of many different pseudo-chaotic number generating algorithms, i.e. meta-encryption-engines. This multi-algorithmic or meta-engine approach provides a way of designing an unlimited class of encryption engines, as opposed to designing a single encryption engine that is operated by changing the key(s), which for some systems, public key systems in particular, involves the use of prime numbers. There are of course a number of disadvantages to this approach, which are discussed later on, but it is worth stating at this point that the principal purpose for exploring the application of deterministic chaos in cryptography is: (a) the non-reliance of such systems on the use of prime numbers, which place certain limits on the characteristics and arithmetic associated with an encryption algorithm; and (b) the unlimited number of chaos-based algorithms that can be, quite literally, invented to produce a meta-encryption engine. A toy example of a single such chaotic keystream generator is sketched below.
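The following sketch is purely illustrative; it is not the Crypstic algorithm and is not secure as written. It uses one classic chaotic iterator, the logistic map x → 4x(1 − x), to generate a keystream, with the initial condition acting as the key; a meta-engine would select among many such maps.

```python
def chaotic_keystream(x0: float, n: int) -> bytes:
    """Toy pseudo-chaotic keystream from the logistic map x -> 4x(1 - x)."""
    x = x0
    for _ in range(100):                  # discard transient iterates
        x = 4.0 * x * (1.0 - x)
    out = bytearray()
    for _ in range(n):
        x = 4.0 * x * (1.0 - x)
        out.append(int(x * 256) & 0xFF)   # crude float-to-byte mapping
    return bytes(out)

def xor_crypt(data: bytes, x0: float) -> bytes:
    """Encrypt/decrypt by XOR with the chaotic keystream (symmetric)."""
    return bytes(d ^ k for d, k in zip(data, chaotic_keystream(x0, len(data))))

cipher = xor_crypt(b"attack at dawn", 0.739)   # key = initial condition
assert xor_crypt(cipher, 0.739) == b"attack at dawn"
```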
II. INFORMATION AND KNOWLEDGE MANAGEMENT

With regard to information security and the management of information in general, there are some basic concepts that are easy to grasp and sometimes tend to get lost in the detail. The first of these is that the recipient of any encrypted message must have some form of a priori knowledge of the method (the algorithm, for example) and the operational conditions (e.g. the key) used to encrypt a message. Otherwise, the recipient is in no better state of preparation than the potential attacker. The conventional approach is to keep this a priori information to a minimum, but in such a way that it is critical to the decryption process. Another important reality is that in an attack, if the information transmitted is not deciphered in good time, then it may become redundant. Coupled with the fact that an attack usually has to focus on a particular criterion (such as a specific algorithm), one way to enhance the security of a communications channel is to continually change the encryption algorithm and/or process offered by the technology available.
Another approach to information management is to disguise or camouflage the encrypted message in what would appear to be innocent or insignificant data, such as a digital photograph, a music file or both, for example1. This is known as Steganography [15]-[17]. Further, information security products themselves should be introduced and organised in such a way as to reflect their apparent insignificance in terms of both public awareness and financial reward, which helps to combat the growing ability to hack and crack using increasingly sophisticated software that is readily available, e.g. [18]. This is of course contrary to the dissemination of many encryption systems, a process that is commonly perceived as being necessary for business development through the establishment of a commercial organisation, international patents, distribution of marketing material, elaborate and sophisticated Web sites, authoritative statements on the strength of a system to impress customers, publications and so on. Thus, a relatively simple but often effective way of maintaining security with regard to the use of an encryption system is to not tell anyone about it. The effect of this can be enhanced by publishing other systems and products that are designed to mislead the potential attacker. In this sense, information management and Information and Communication Technology (ICT) security products should be treated in the same way as many organisations treat a breach of security, i.e. not to publish the breach in order to avoid embarrassment and loss of faith by the client base.

1 By encoding the encrypted message in the least significant bit or bit-pair of the host data, for example.
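As a minimal sketch of the least-significant-bit idea in the footnote above, assuming the host is simply a buffer of raw sample bytes (no image or audio file-format handling):

```python
def embed_lsb(host: bytearray, message: bytes) -> None:
    """Hide each message bit in the least significant bit of one host byte."""
    bits = [(b >> i) & 1 for b in message for i in range(8)]
    if len(bits) > len(host):
        raise ValueError("host buffer too small for message")
    for i, bit in enumerate(bits):
        host[i] = (host[i] & 0xFE) | bit      # overwrite only the LSB

def extract_lsb(host: bytes, n_bytes: int) -> bytes:
    """Recover n_bytes of hidden data from the host's least significant bits."""
    out = bytearray()
    for j in range(n_bytes):
        out.append(sum((host[8 * j + i] & 1) << i for i in range(8)))
    return bytes(out)

samples = bytearray(range(200))               # stand-in for raw image samples
embed_lsb(samples, b"hidden")
assert extract_lsb(bytes(samples), 6) == b"hidden"
```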
A. Secrets and Ultra-secrets

A classic mistake (of historical importance) of not keeping it quiet, in particular, not maintaining silent warfare [19], was made by Winston Churchill when he published his analysis of World War I. In his book, The World Crisis 1911-1918, published in 1923, he stated that the British had deciphered the German naval codes for much of the war as a result of the Russians salvaging a code book from the small cruiser Magdeburg, which had run aground off Estonia on August 27, 1914. The code book was passed on to Churchill who was, at the time, the First Lord of the Admiralty. This helped the British maintain their defences with regard to the German navy before and after the Battle of Jutland in May 1916. The German navy became impotent, which forced Germany into a policy of unrestricted submarine warfare. In turn, this led to an event (the sinking on May 7, 1915 of the Lusitania, torpedoed by a German submarine, the U-20) that galvanized American opinion against Germany and played a key role in the United States' later entry into the First World War on April 6, 1917 and the defeat of Germany [20], [21].
Churchill's publication did not go unnoticed by the German military between the First and Second World Wars. Consequently, significant efforts were made to develop new encryption devices for military communications. This resulted in the famous Enigma machine, named after Sir Edward Elgar's masterpiece, the Enigma Variations [22]. Enigma was an electro-mechanical machine about the size of a portable typewriter which, through application of both electrical (plugboard) and mechanical (rotor) settings, offered 2 × 10^20 permutations for establishing a key. The machine could be


used without difficulty by semi-skilled operators under the most extreme battle conditions. The keys could be changed daily or several times a day according to the number of messages transmitted.
The interest in cryptology by Germany that was undoubtedly stimulated by Churchill's indiscretions included establishing a specialist cipher school in Berlin. Ironically, it was at this school that some of the Polish mathematicians were trained who later worked for the Polish Cipher Office, opened in utmost secrecy at Poznań in 1930 [23], [24]. In January 1929, the Dean of the Department of Mathematics, Professor Zdzislaw Krygowski from the University of Poznań, provided a list of his best graduates to start working at this office. One of these graduates was the brilliant young logician Marian Rejewski, who pioneered the design of the Bomba kryptologiczna, an electro-mechanical device used for eliminating combinations that had not been used to encrypt a message with the Enigma cipher [25]. However, the design of the Bomba kryptologiczna was only made possible through the Poles gaining access to the Enigma machine and obtaining knowledge of its mechanism without alerting the Germans to their activities. In modern terms, this is equivalent to obtaining information on the type of encryption algorithm used in a cryptosystem.
The Bomba kryptologiczna helped the Poles to decipher some 100,000 Enigma messages from as early as January 1933 to September 1939, including details associated with the remilitarization of the Rhine Province, the Anschluss of Austria and the seizure of the Sudetenland. It was Rejewski's original work that formed the basis for designing the advanced electro-mechanical and, later, electronic decipher machines (including Colossus, the world's first programmable computer) constructed and utilized at Bletchley Park between 1943 and 1945 [26], [27].
After the Second World War, Winston Churchill made sure that he did not repeat his mistake, and what he referred to as his Ultra-secret, the code breaking activities undertaken at Station X in Bletchley Park, England, was ordered by him to be closed down and the technology destroyed soon after the end of the war. Further, Churchill never referred to his Ultra-secret in any of his publications after the war. Those personnel who worked at Bletchley Park were required to maintain their silence for some fifty years afterwards, and some of the activities at Bletchley Park remain classified to this day. Bletchley Park is now a museum, which includes a reconstruction of Colossus undertaken in the mid-1990s. However, the type of work undertaken there in the early 1940s continues in many organisations throughout the world, such as the Government Communications Headquarters (GCHQ) based at Cheltenham in England [28], where a range of code making and code breaking activities continue to be developed.
The historical example given above clearly illustrates the importance of maintaining a level of secrecy when undertaking cryptographic activities. It also demonstrates the importance of not publishing new algorithms, a principle that is at odds with the academic community; namely, that the security of a cryptosystem should not depend upon algorithm secrecy. However, this has to be balanced with regard to the dissemination of information in order to advance a concept through peer review and national and international collaboration. Taken to an extreme, the secrecy factor can produce a psychological imbalance that is detrimental to progress. Some individuals like to use confidential information to enhance their status. In business, this often leads to issues over the signing of Non-Disclosure Agreements or NDAs, for example, leading to delays that are of little value, especially when it turns out that there is nothing worth disclosing. Thus, the whole issue of keeping it quiet has to be implemented in a way that is balanced, such that confidentiality does not lead to stagnation in the technical development of a cryptosystem. However, used correctly and through the appropriate personality, issues over confidentiality coupled with the feel important factor can be used to good effect in the dissemination of disinformation.
B. Home-Spun Systems Development

The development and public use of information security technology is one of the most interesting challenges for state control over the information society. As more and more members of the younger generation become increasingly IT literate, it is inevitable that a larger body of perfectly able minds will become aware of the fact that cryptology is not as difficult as they may have been led to believe. As with information itself, the days when cryptology was in the hands of a select few with impressive academic credentials and/or luxury civil service careers are over, and cryptosystems can now be developed by those with a diverse portfolio of backgrounds, which does not necessarily include a university education. This is reflected in the fact that after the Cold War, the UK Ministry of Defence, for example, developed a strategy for developing products driven by commercially available systems. This Commercial-Off-The-Shelf or COTS approach to defence technology has led directly to the downsizing of the UK Scientific Civil Service which, during the Cold War, was a major source of scientific and technical innovation.
The average graduate of today can rapidly develop the ability to write an encryption system which, although relatively simple, possibly trivial and ill-informed, can, by the very nature of its non-compliance with international standards, provide surprisingly good security. This can lead to problems with the control and management of information when increasingly more individuals, groups, companies, agencies and nation states decide that they can go it alone and do it themselves. While each home-grown encryption system may be relatively weak compared to those that have had expert development over many years, have been well financed and have been tested against the very best of attack strategies, the proliferation of such systems is itself a source of significant difficulty for any authority whose role is to monitor communications traffic in a way that is timely and cost effective. This is why governments world-wide are constantly attempting to control the use and exploitation of new encryption methods in the commercial sector2. It also explains the importance of international encryption standards in terms of both public perception and

2 For example, the introduction of legislation in mainland UK concerning the decryption of messages by a company client through enforcement of the Regulation of Investigatory Powers (RIP) Act, 2000.


free market exploitation. Governments and other controlling authorities like to preside over a situation in which everybody else is confidently reliant for their information security on products that have been developed by the very authorities that encourage their use, a use that is covertly diffused into the information society through various legitimate business ventures coupled with all the usual commercial sophistication and investment portfolios. Analysis of this type can lead to a range of unsubstantiated conspiracy theories, but it is only by thinking through such possible scenarios that new concepts in information management, some of which may be of practical value, are evolved. The proliferation of stand-alone encryption systems that are designed and used by informed individuals is not only possible but inevitable, an inevitability that is guided by the principle that if you want to know what you are eating then you should cook it yourself. Security issues of this type have become the single most important agenda for future government policy on information technology, especially when such systems have been home-spun by those who have learned to fully respect that they should, in the words of Shakespeare, "Neither a borrower, nor a lender be"3.
learned to fully respect that they should, in the words of
Shakespeare, Neither a borrower, nor a lender be3 .
C. Disinformation
Disinformation is used to tempt the enemy into believing certain kinds of information. The information may not be true, or may contain aspects that are designed to cause the enemy to react in an identifiable way that provides a strategic advantage [29], [30]. Camouflage, for example, is a simple example of disinformation [31]. This includes techniques for transforming encrypted data into forms that resemble the environments through which an encrypted message is to be sent [32], [33]. At a more sophisticated level, disinformation can include encrypted messages that are created with the sole purpose of being broken in order to reveal information that the enemy will react to by design.
Disinformation includes arranging events and processes that are composed to protect against an enemy acquiring knowledge of a successful encryption technology and/or a successful attack strategy. A historically significant example of this involved the Battle of Crete, which began on the morning of 20 May 1941 when Nazi Germany launched an airborne invasion of Crete under the code-name Unternehmen Merkur (Operation Mercury) [34]. During the next day, through miscommunication and the failure of Allied commanders to grasp the situation, the Maleme airfield in western Crete fell to the Germans, which enabled them to fly in heavy reinforcements and overwhelm the Allied forces. This battle was unique in two respects: it was the first airborne invasion in history4; and it was the first time the Allies made significant use of their ability to read Enigma codes. The British had known for some weeks prior to the invasion of Crete that an invasion was likely because of the work being undertaken at Bletchley Park.
They faced a problem because of this. If Crete was reinforced
in order to repel the invasion then Germany would suspect
that their encrypted communications were being compromised.

3 From William Shakespeare's play, Hamlet.
4 Illustrating the potential of paratroopers and so initiating the Allied development of their own airborne divisions.


But this would also be the case if the British and other Allied troops stationed on Crete were evacuated. The decision was, therefore, taken by Churchill to let the German invasion proceed with success but not without giving the invaders a 'bloody nose'. Indeed, in light of the heavy casualties suffered by the parachutists, Hitler forbade further airborne operations and Crete was dubbed 'the graveyard of the German parachutists'. The graveyard for German, British, Greek and Allied soldiers alike was not a product of a fight over desirable and strategically important territory (at least for the British). It was a product of the need to secure Churchill's 'Ultra' secret. In other words, the Allied efforts to repulse the German invasion of Crete were, in reality, a form of disinformation, designed to secure a secret that was, in the bigger picture, more important than the estimated 16,800 dead and wounded that the battle cost.
D. Plausible Deniability
Deniable encryption allows an encrypted message to be decrypted in such a way that different and plausible plaintexts can be obtained using different keys [35]. The idea is to make it impossible for an attacker to prove the existence of the real message, a message that requires a specific key. This approach provides the user with a solution to the 'gun to the head problem' as it allows the sender to have plausible deniability if compelled to give up the encryption key.
There is a range of different methods that can be designed to implement such a scheme. For example, a single ciphertext can be generated that is composed of randomised segments or blocks of data which correlate to blocks of different plaintexts encrypted using different keys. A further key is then required to assemble the appropriate blocks in order to generate the desired decrypt; a minimal sketch of this idea is given below. This approach, however, leads to ciphertext files that are significantly larger than the plaintexts they contain. On the other hand, a ciphertext file should not necessarily be the same size as the plaintext file, and padding out the plaintext before encryption can be used to increase the Entropy of the ciphertext (as discussed in Section VIII).
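The following Python sketch illustrates this block-composition idea. It is entirely illustrative (the hash-derived keystream, the 16-byte block size and the 'assembly key' format are hypothetical choices, not a vetted construction): a real and a decoy plaintext are encrypted under different keys, the ciphertext blocks are shuffled into a single file, and either plaintext can be reassembled from its own assembly key.

import os, hashlib

def keystream(key: bytes, n: int) -> bytes:
    # Derive n pseudo-random bytes from key by iterated hashing
    # (illustrative only - not a vetted cipher).
    out, block = b"", key
    while len(out) < n:
        block = hashlib.sha256(block).digest()
        out += block
    return out[:n]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def hide(plaintexts, keys, block=16):
    # Encrypt each plaintext under its own key, split into blocks and
    # shuffle all blocks together. Each 'assembly key' records which
    # ciphertext blocks belong to which plaintext, and in what order.
    pool, maps = [], []
    for p, k in zip(plaintexts, keys):
        p = p + b" " * (-len(p) % block)            # pad to the block size
        c = xor(p, keystream(k, len(p)))
        idx = list(range(len(pool), len(pool) + len(c) // block))
        pool += [c[i:i + block] for i in range(0, len(c), block)]
        maps.append(idx)
    order = sorted(range(len(pool)), key=lambda _: os.urandom(4))
    inverse = {old: new for new, old in enumerate(order)}
    assembly = [[inverse[i] for i in m] for m in maps]
    return b"".join(pool[i] for i in order), assembly

def reveal(ciphertext, assembly_key, key, block=16):
    blocks = [ciphertext[i:i + block] for i in range(0, len(ciphertext), block)]
    c = b"".join(blocks[i] for i in assembly_key)
    return xor(c, keystream(key, len(c)))

real, decoy = b"meet at dawn", b"nothing to see here"
k1, k2 = b"real-key", b"decoy-key"
ct, (amap_real, amap_decoy) = hide([real, decoy], [k1, k2])
print(reveal(ct, amap_real, k1))    # padded real plaintext
print(reveal(ct, amap_decoy, k2))   # padded decoy plaintext

Note that, as stated above, the single ciphertext is necessarily larger than either plaintext it contains.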
Other methods used for deniable encryption involve establishing a number of abstract layers that are decrypted to yield different plaintexts for different keys. Some of these layers are designed to include so-called 'chaff layers'. These are layers that are composed of random data which allow the owner of the data to plausibly deny the existence of layers containing the real ciphertext data. The user can store decoy files on one or more layers while denying the existence of others, identifying the existence of chaff layers as required. The layers are based on file systems that are typically stored in a single directory consisting of files with filenames that are either randomized (in the case where they belong to chaff layers), or are based on strings that identify cryptographic data, the timestamps of all files being randomized throughout.
E. Obfuscation
In a standard computing (Windows) environment, a simple form of camouflage can be implemented by renaming files to


be of a different type; for example, storing an encrypted data file as a .exe or .dll file. Some cryptosystems output files with identifiable extensions such as .enc which can then be simply filtered by a firewall. Another example includes renaming files in order to access data and/or execute an encryption engine. For example, by storing an executable file as a .dll (dynamic link library) file (which has a similar structure to a .exe file) in a directory full of real .dll files associated with some complex applications package, the encryption engine can be obfuscated, especially if it has a name that is similar to the environment of files in which it is placed. By renaming the file back to its 'former self', execution of a cryptosystem can be undertaken in the usual way. However, this requires that the executable file is forensically inert, i.e. it does not contain data that reflects its purpose. A simple way of implementing this requirement is to ensure that the source code (prior to compilation) is devoid of any arrays, comments etc. that include references (through use of named variables, for example) to the type of application (e.g. comments such as 'encrypt the data' or named arrays such as decrypt_array[i]).
F. Steganographic Encryption
It is arguable that disinformation should, where possible, be used in conjunction with the exchange of encrypted information which has been camouflaged using steganographic techniques for hiding the ciphertext. For example, suppose that it had been known by Germany that the Enigma ciphers were being compromised by the British during the Second World War. Clearly, it would then have been strategically advantageous for Germany to propagate disinformation using Enigma. If, in addition, real information had been encrypted differently and the ciphertexts camouflaged using broadcasts through the German home radio service, for example, then the outcome of the war could have been very different. The use of new encryption methods coupled with camouflage and disinformation, all of which are dynamic processes, provides a model that, while not always of practical value, is strategically comprehensive and has only rarely been fully realised. Nevertheless, some of the techniques that have been developed and are reported in this work are the result of an attempt to realise this model.

III. BASIC CONCEPTS

Irrespective of the wealth of computational techniques that can be invented to encrypt data, there are some basic concepts that are a common theme in modern cryptography. The application of these concepts typically involves the use of random number generators and/or the use of algorithms that originally evolved for the generation of random number streams, algorithms that are dominated by two fundamental and interrelated themes [4]-[6]: (i) the use of modular arithmetic; (ii) the application of prime numbers. The application of prime numbers is absolutely fundamental to a large range of encryption processes and international standards such as PKI (Public Key Infrastructure), details of which are discussed later.

Fig. 1. Alice and Bob can place a message in a box which can be secured using a combination lock and sent via a public network - the postal service, for example.

Using a traditional paradigm, we consider the problem of how Alice (A) and Bob (B) can pass a message to and from each other without it being compromised or attacked by an intercept. As illustrated in Figure 1, we consider a simple box and combination lock scenario. Alice and Bob can write a message, place it in the box, lock the box and then send it through an open channel - the postal services, for example. In cryptography, the strength of the box is analogous to the strength of the cipher. If the box is weak enough to be opened by brute force, then the strength of the lock is relatively insignificant. This is analogous to a cipher whose statistical properties are poor, for example, i.e. whose Probability Density Function (PDF) is narrow and whose information Entropy is relatively low, with a similar value to the plaintext. The strength of the lock is analogous to the strength of the key in a real cryptographic system. This includes the size of the combination number which is equivalent to the length of the key that is used. Clearly, a four rotor combination lock as illustrated in Figure 1 represents a very weak key since the number of ordered combinations required to attempt a brute force attack to open the lock is relatively low, i.e. for a 4-digit combination lock where each rotor has ten digits 0-9, the number of possible combinations is 10000 (including 0000). However, the box-and-lock paradigm being used here is for illustrative purposes only.

A. Symmetric Encryption

Symmetric encryption is the simplest and most obvious approach to Alice and Bob sending their messages. Alice and
Bob agree on a combination number a priori. Alice writes a
message, puts it in the box, locks it and sends it off. Upon
receipt, Bob unlocks the box using the combination number
that has been agreed and recovers the message. Similarly,
Bob can send a message to Alice using exactly the same
approach or protocol. Since this protocol is exactly the same
for Alice and Bob it has a symmetry and thus, encryption
methods that adopt this protocol are referred to as symmetric
encryption methods. Given that the box and the lock have
been designed to be strong, the principal weakness associated
with this method is its vulnerability to attack if a third party
obtains the combination number at the point when Alice and


Bob invent it and agree upon it. Thus, the principal problem
in symmetric encryption is how Alice and Bob exchange the
key. Irrespective of how strong the cipher and key are, unless
the key exchange problem can be solved in an appropriate and
a practicable way, symmetric encryption always suffers from
the same fundamental problem - key exchange!
If E denotes the encryption algorithm that is used, which depends upon a key K to encrypt plaintext P, then we can consider the ciphertext C to be given by

C = E_K(P).

Decryption can then be denoted by the equation

P = E_K^{-1}(C).

Note that it is possible to encrypt a number of times using different keys K_1, K_2, ... with the same encryption algorithm to give a double encrypted ciphertext

C = E_{K_2}(E_{K_1}(P))

or a triple encrypted ciphertext

C = E_{K_3}(E_{K_2}(E_{K_1}(P))).

Decryption is then undertaken using the same keys in the reverse order to which they have been applied, i.e.

P = E_{K_1}^{-1}(E_{K_2}^{-1}(E_{K_3}^{-1}(C))).
Symmetric encryption systems, which are also referred to as shared secret systems or private key systems, are usually significantly easier to use than systems that employ different protocols (such as asymmetric encryption). However, the requirements and methods associated with key exchange sometimes make symmetric systems difficult to use. Examples of symmetric encryption systems include the Data Encryption Standard (DES) and DES3 (essentially, but not literally, the Data Encryption Standard with triple encryption) and the Advanced Encryption Standard (AES). Symmetric systems are commonly used in many banking and other financial institutions and in some military applications. A well known historical example of a symmetric encryption engine, originally designed for securing financial transactions, and later used for military communications, was the Enigma.
B. Asymmetric Ciphers
Instead of Alice and Bob agreeing on a combination number
a priori, suppose that Alice sets her lock to be open with a
combination number known only to her. If Bob then wishes to
send Alice a message, he can make a request for her to send
him an open lock. Bob can then write his message, place it
in the box which is then locked and sent on to Alice. Alice
can then unlock the box and recover the message using the
combination number known only to her. The point here is
that Bob does not need to know the combination number, he
only needs to receive an open lock from Alice. Of course
Bob can undertake exactly the same procedure in order to
receive a message from Alice. Clearly, the processes that are
undertaken by Alice and Bob in order to send and receive a
single message are not the same. The protocol is asymmetric

and we refer to encryption systems that use this protocol as


being asymmetric. Note that Alice could use this protocol to
receive messages from any number of senders provided they
can get access to one of her open locks. This can be achieved
by Alice distributing many such locks as required.
One of the principal weaknesses of this approach relates to the lock being obtained by a third party whose interest is in sending bogus information or disinformation to Alice. The problem for Alice is to find a way of validating that a message sent from Bob (or anyone else who is entitled to send messages to her) is genuine, i.e. that the message is authentic. Thus, data authentication becomes of particular importance when implementing asymmetric encryption systems.
Asymmetric encryption relies on both parties having two keys. The first key (the public key) is shared publicly. The second key is private and is kept secret. When working with asymmetric cryptography, the message is encrypted using the recipient's public key. The recipient then decrypts the message using the private key. Because asymmetric ciphers tend to be computationally intensive (compared to symmetric encryption), they are usually used in combination with symmetric systems to implement public key cryptography. Asymmetric encryption is often used to transfer a session key rather than information proper - plaintext. This session key is then used to encrypt information using a symmetric encryption system. This gives the key exchange benefits of asymmetric encryption with the speed of symmetric encryption. A well known example of asymmetric encryption - also known as public key cryptography - is the RSA algorithm which is discussed later. This algorithm uses specific prime numbers (from which the private and public keys are composed) in order to realize the protocol.
In order to provide users with appropriate prime numbers, an
infrastructure needs to be established by a third party whose
business is to distribute the public/private key pairs. This
infrastructure is known as the Public Key Infrastructure or
PKI. The use of a public key is convenient for those who wish
to communicate with more than one individual and is thus a many-to-one protocol that avoids multiple key-exchange. On the other hand, a public key provides a basis for cryptanalysis. Given that C = E_K(P) where K is the public key, the analyst can guess P and check the answer by comparing C with the intercepted ciphertext, a guess that is made easier if it is based on a known Crib - i.e. information that can be assumed to be a likely component of the plaintext. Public key algorithms are therefore often designed to resist chosen-plaintext attack. Nevertheless, analysis of public key and asymmetric systems in general reveals that the level of security is not as significant as that which can be achieved using a well-designed symmetric system. One obvious and fundamental issue relates to the third party responsible for the PKI and how much trust should be assumed, especially with regard to legislation concerning issues associated with the use of encrypted material.
C. Three-Way Pass Protocol
The three-way pass protocol, at first sight, provides a solution to the weaknesses associated with symmetric and


asymmetric encryption. Suppose that Alice writes a message,


puts it in the box, locks the box with a lock whose combination
number is known only to her and sends it on to Bob. Upon
receipt Bob cannot open the box, so Bob locks the box with
another lock whose combination number is known only to
himself and sends it back to Alice. Upon receipt, Alice can
remove her lock and send the box back to Bob (secured with
his lock only) who is then able to remove his lock and recover
the message. Note that by using this protocol, Alice and Bob
do not need to agree upon a combination number; this avoids
the weakness of symmetric encryption. Further, Alice and Bob
do not need to send each other open locks which is a weakness
of asymmetric encryption.
The problem with this protocol relates to the fact that it requires the message (secured in the locked box) to be exchanged three times. To explain this, suppose we have plaintext in the form of an American Standard Code for Information Interchange or ASCII-value array p[i], say. Alice generates a cipher n1[i] using some appropriate strength random number generator and an initial condition based on some long integer - the key. Let the ciphertext c1[i] be generated by adding the cipher to the plaintext, i.e.

c1[i] = p[i] + n1[i]

which is transmitted to Bob. This is a substitution-based encryption process and is equivalent to Alice securing the message in the box with her lock - the first pass. Bob generates a new cipher n2[i] using the same (or possibly a different) random number generator with a different key and generates the ciphertext

c2[i] = c1[i] + n2[i] = p[i] + n1[i] + n2[i]

which is transmitted back to Alice - the second pass. Alice now uses her cipher to generate

c3[i] = c2[i] - n1[i] = p[i] + n2[i]

which is equivalent to her taking off her lock from the box and sending the result back to Bob - the third pass. Bob then uses his cipher to recover the message, i.e.

c3[i] - n2[i] = p[i].
However, suppose that the ciphertexts c1, c2 and c3 are intercepted; then the plaintext array can be recovered since

p[i] = c3[i] + c1[i] - c2[i].

This is the case for any encryption process that is commutative and associative. For example, if the arrays are considered to be bit streams and the encryption process undertaken using the XOR process (denoted by ⊕), then

c1 = n1 ⊕ p,
c2 = n2 ⊕ c1 = n2 ⊕ n1 ⊕ p,
c3 = n1 ⊕ c2 = n2 ⊕ p

and

c1 ⊕ c2 ⊕ c3 = p.

This is because, for any bit streams a, b and c,

a ⊕ a ⊕ b = b

and because the XOR operation is both commutative and associative, i.e.

a ⊕ b = b ⊕ a

and

a ⊕ (b ⊕ c) = (a ⊕ b) ⊕ c.

These properties are equivalent to the fact that when Alice receives the box at the second pass with both locks on it, she can, in principle, remove the locks in any order. If, however, she had to remove Bob's lock before her own, then the protocol would become redundant.
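The attack described above is easily verified numerically. The following Python sketch (illustrative; any commutative and associative cipher behaves the same way) runs the three passes using XOR and shows that an interceptor holding c1, c2 and c3 recovers the plaintext directly.

import os

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

p = b"attack at dawn"
n1 = os.urandom(len(p))   # Alice's cipher (her 'lock')
n2 = os.urandom(len(p))   # Bob's cipher (his 'lock')

c1 = xor(n1, p)           # first pass:  Alice -> Bob
c2 = xor(n2, c1)          # second pass: Bob -> Alice
c3 = xor(n1, c2)          # third pass:  Alice -> Bob (her lock removed)

assert xor(n2, c3) == p   # Bob recovers the message
# An eavesdropper holding all three passes recovers p directly,
# because XOR is commutative and associative:
assert xor(xor(c1, c2), c3) == p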
D. Private Key Encryption
One of the principal goals in private key cryptography is
to design Pseudo Random Number Generators (PRNGs) that
provide outputs (random number streams) where no element
can be predicted from the preceding elements given complete
knowledge of the algorithm. Another important feature is to
produce generators that have long cycle lengths. A further
important feature, is to ensure that the Entropy of the random
number sequence is a maximum, i.e. that the histogram of the
number stream is uniform.
The use of modular integer arithmetic coupled with the use
of prime numbers in the development of encryption algorithms
tends to provide functions which are not invertible. They are
one-way functions that can only be used to reproduce a specic
(random) sequence of numbers from the same initial condition.
The basic idea in stream ciphers - as used for private key (symmetric) cryptography - is to convert a plaintext into a ciphertext using a key that is used as a seed for the PRNG. A plaintext file is converted to a stream of integer numbers using ASCII conversion. For example, suppose we wish to encrypt the author's surname 'Blackledge' for which the ASCII5 decimal integer stream or vector is
p = (66, 108, 97, 99, 107, 108, 101, 100, 103, 101).

Suppose we now use the linear congruential PRNG defined by6

n_{i+1} = a n_i mod P

where a = 13, P = 131 and let the seed be 250659, i.e. n_0 = 250659. The output of this iteration is

n = (73, 32, 23, 37, 88, 96, 69, 111, 2, 26).

If we now add the two vectors together, we generate the cipher stream

c = p + n = (139, 140, 120, 136, 195, 204, 170, 211, 105, 127).
Clearly, provided the recipient of this number stream has
access to the same algorithm (including the values of the
parameters a and P ) and crucially, to the same seed n0 ,
5 Any code can be used.
6 Such a PRNG is not suitable for cryptography and is being used for illustrative purposes only.


the vector n can be regenerated and p obtained from c by subtracting n from c. However, in most cryptographic systems, this process is usually accomplished using binary streams, where the binary stream representation of the plaintext f and that of the random number stream or cipher n are used to generate the ciphertext binary stream c via the process

c = n ⊕ f.

Restoration of the plaintext is then accomplished via the operation

f = n ⊕ c = n ⊕ n ⊕ f.

The processes discussed above are examples of digital confusion in which the information contained in the field f (the plaintext) is confused using a stochastic function n (the cipher) via addition (decimal integer process) or with an XOR operator (binary process). Here, the seed plays the part of a key that is utilized for the process of encryption and decryption. This is an example of symmetric encryption in which the key is a private key known only to the sender and recipient of the encrypted message.
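The worked example above can be reproduced directly. The following Python sketch implements the iteration n_{i+1} = a n_i mod P with a = 13, P = 131 and seed 250659, regenerating the cipher stream and ciphertext quoted above; as footnote 6 stresses, such a generator is for illustration only.

# Reproduces the worked example: a = 13, P = 131, seed n0 = 250659.
def lcg(a, P, seed):
    n = seed
    while True:
        n = (a * n) % P
        yield n

plaintext = "Blackledge"
p = [ord(ch) for ch in plaintext]        # ASCII values (66, 108, 97, ...)
gen = lcg(13, 131, 250659)
n = [next(gen) for _ in p]               # (73, 32, 23, 37, 88, 96, 69, 111, 2, 26)
c = [pi + ni for pi, ni in zip(p, n)]    # (139, 140, 120, 136, ...)

# The recipient, knowing a, P and the seed, regenerates n and subtracts it:
gen = lcg(13, 131, 250659)
recovered = "".join(chr(ci - next(gen)) for ci in c)
assert recovered == plaintext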
Given that the algorithm used to generate the random number stream has public access (together with the parameters it uses, which are typically hard-wired in order to provide a random field pattern with a long cycle length), the problem is how to securely exchange the key to the recipient of the encrypted message so that decryption can take place. If the key is particular to a specific communication and is used once and once only for this communication (other communications being encrypted using different keys), then the process is known as a one-time pad, because the key is only used once. Simple though it is, this process is not open to attack. In other words, no form of cryptanalysis will provide a way of deciphering the encrypted message. The problem is how to exchange the keys in a way that is secure and thus, solutions to the key exchange problem are paramount in symmetric encryption.
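A minimal sketch of the one-time pad follows (illustrative; os.urandom stands in for a source of truly random key material, which must be as long as the message, exchanged securely and never reused).

import os

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

message = b"one-time pad demo"
key = os.urandom(len(message))    # random, message-length, used once only
ciphertext = xor(message, key)
assert xor(ciphertext, key) == message   # same key restores the plaintext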
The illustration of stream cipher encryption given above highlights the problem of key exchange, i.e. providing the value of n_0 to both sender and receiver. In addition to developing the technology for symmetric encryption (e.g. the algorithm or algorithms), it is imperative to develop appropriate protocols and procedures for using it effectively with the aim of reducing inevitable human error, the underlying principles being: (i) the elimination of any form of temporal correlation in the algorithm used; (ii) the generation of a key that is non-intuitive and at best random; (iii) the exchange of the key once it has been established.
E. Public-Private Key Encryption
Public-Private Key Encryption [36]-[40] is fundamentally asymmetric and, in terms of the box and combination-lock paradigm, is based on considering a lock which has two combinations, one to open the lock and another to lock it. The second constraint is the essential feature because one of the basic assumptions in the use of combination locks is that they can be locked irrespective of the rotor positions. Thus, after writing a message, Alice uses one of Bob's specially designed locks to lock the box using a combination number that is unique to Bob but is openly accessible to Alice and others who want to send Bob a message. This combination number is equivalent to the public key. Upon reception, Bob can open the lock using a combination number that is known only to himself - equivalent to a private key. However, to design such a lock, there must be some 'mechanical property' linking the combination numbers required to first lock it and then unlock it. It is this property that is the principal vulnerability associated with public/private key encryption, a property that is concerned with certain precise and exact relationships that are unique to the use of prime numbers and their applications with regard to generating pseudo random number streams and stochastic functions in general [41].
The most common example of a public-private key encryption algorithm is the RSA algorithm [39], which gets its name from the three inventors, Rivest, Shamir and Adleman, who developed the generator in the mid 1970s7. It has since withstood years of extensive cryptanalysis. To date, cryptanalysis has neither proved nor disproved the security of the algorithm in a complete and self-consistent form, which suggests a high confidence level in the algorithm.
The basic generator is given by

n_{i+1} = n_i^e mod(pq)

where p, q and e are prime numbers and e < pq. Although this generator can be used to compute a pseudo random number stream n_i, the real value of the algorithm lies in its use for transforming plaintext P_i (taken to be a decimal integer array based on ASCII 7-bit code, for example) to ciphertext C_i directly via the equation

C_i = P_i^e mod(pq).

We then consider the decryption process to be based on the same type of transform, i.e.

P_i = C_i^d mod(pq).

The problem is then to find d given e, p and q. The key to solving this problem is to note that if ed - 1 is divisible by (p - 1)(q - 1), i.e. d is given by the solution of

de = 1 mod[(p - 1)(q - 1)],

then

C_i^d mod(pq) = P_i^{ed} mod(pq) = P_i mod(pq)

using Fermat's Little Theorem, i.e. for any integer a and prime number p,

a^p = a mod p.

Note that this result is strictly dependent on the fact that ed - 1 is divisible by (p - 1)(q - 1), making e a relative prime of (p - 1)(q - 1) so that e and (p - 1)(q - 1) have no common factors except for 1. This algorithm is the basis for many public/private or asymmetric encryption methods.
7 There are some claims that the method was first developed at GCHQ in England and then re-invented (or otherwise) by Rivest, Shamir and Adleman in the USA; the method was not published openly by GCHQ - such are the realities of keeping it quiet.


Here, the public key is given by the number e and the product pq, which are unique to a given recipient and in the public domain (like an individual's telephone number). Note that the prime numbers p and q and the number e < pq must be distributed to Alice and Bob in such a way that they are unique to Alice and Bob on the condition that d exists! This requires an appropriate infrastructure to be established by a trusted third party whose business is to distribute values of e, pq and d to its clients - a Public Key Infrastructure. A PKI is required in order to distribute public keys, i.e. different but appropriate values of e and pq for use in public key cryptography (RSA algorithm).
This requires the establishment of appropriate authorities and directory services for the generation, management and certification of public keys.
Recovering the plaintext from the public key and the ciphertext can be conjectured to be equivalent to factoring the product of the two primes. The success of the system, which is one of the oldest and most popular public key cryptosystems, is based on the difficulty of factoring. The principal vulnerability of the RSA algorithm with regard to an attack is that e and pq are known and that p and q must be prime numbers - elements of a large but (assumed) known set. To attack the cipher, d must be found. But it is known that d is the solution of

de = 1 mod[(p - 1)(q - 1)]

which is only solvable if e < pq is a relative prime of (p - 1)(q - 1). An attack can therefore be launched by searching through prime numbers whose magnitudes are consistent with the product pq (which provides a search domain) until the relative prime condition is established for factors p and q. However, factoring pq to calculate d given e is not trivial. It is possible to attack an RSA cipher by guessing the value of (p - 1)(q - 1) but this is no easier than factoring pq, which is the most obvious means of attack. It is also possible for a cryptanalyst to try every possible d but this brute force approach is less efficient than trying to factor pq.
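Both the algorithm and the attack described above can be made concrete with deliberately tiny primes. The following Python sketch is illustrative only: the prime values are arbitrary, Python 3.8+ is assumed for the built-in modular inverse pow(e, -1, phi), and trial division stands in for the factoring step that is infeasible at realistic key sizes. It generates a key pair, encrypts and decrypts an ASCII integer array, and then recovers the private key d by factoring pq.

# Toy RSA: key generation, encryption, decryption and a brute-force
# factoring attack (real RSA uses primes with hundreds of digits,
# for which the factoring step below is computationally infeasible).
p, q, e = 61, 53, 17
n, phi = p * q, (p - 1) * (q - 1)
d = pow(e, -1, phi)                      # solves de = 1 mod [(p-1)(q-1)]

P = [ord(ch) for ch in "RSA"]            # plaintext as ASCII integers, Pi < pq
C = [pow(Pi, e, n) for Pi in P]          # Ci = Pi^e mod (pq): public key (e, pq)
assert [pow(Ci, d, n) for Ci in C] == P  # Pi = Ci^d mod (pq): private key d

# Attack: factor the public modulus pq, then recompute d.
def factor(m):
    f = 2
    while f * f <= m:
        if m % f == 0:
            return f, m // f
        f += 1
    raise ValueError("no factor found")

pa, qa = factor(n)                       # the hard step for large pq
da = pow(e, -1, (pa - 1) * (qa - 1))
assert [pow(Ci, da, n) for Ci in C] == P # recovered key decrypts the ciphertext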
In general, RSA cryptanalysis (see Section IV) has shown
that the attacks discovered to date illustrate the pitfalls to be
avoided when implementing RSA. Thus, even though RSA
ciphers can be attacked, the algorithm can still be considered
secure when used properly. In order to ensure the continued
strength of the cipher, RSA run factoring challenges on their
websites. As with all PKI and other cryptographic products,
this algorithm is possibly most vulnerable to authorities (at
least those operating in the UK) having to conform to the
Regulation of Investigatory Powers Act 2000, Section 49.
IV. CRYPTANALYSIS
Any cryptographic system must be able to withstand cryptanalysis [42]. Cryptanalysis methods depend critically on the encryption techniques which have been developed and are, therefore, subject to delays in publication. Cryptanalysts work on attacks to try and break a cryptosystem. In many cases, the cryptanalysts are aware of the algorithm used and will attempt to break the algorithm in order to compromise the keys or gain access to the actual plaintext. It is worth noting that even though a number of algorithms are freely published, this does not in any way mean that they are the most secure. Most government institutions and the military do not reveal the type of algorithm used in the design of a cryptosystem. The rationale for this is that, if we find it difficult to break a code with knowledge of the algorithm, then how much more difficult is it to break a code if the algorithm is unknown? On the other hand, within the academic community, security in terms of algorithm secrecy is not considered to be of high merit and publication of the algorithm(s) is always recommended. It remains to be understood whether this is a misconception within the academic world (due in part to the innocence associated with academic culture) or a covertly induced government policy (by those who are less innocent!). For example, in 2003, it was reported that the Americans had broken ciphers used by the Iranian intelligence services. What was not mentioned was the fact that the Iranian ciphers were based on systems purchased indirectly from the USA and thus based on USA-designed algorithms [45].
The known-algorithm approach originally comes from the work of Auguste Kerckhoffs. Kerckhoffs' Principle states that: 'A cryptosystem should be secure even if everything about the system, except the key, is public knowledge'. This principle was reformulated by Claude Shannon as 'the enemy knows the system' and is widely embraced by cryptographers world-wide. In accordance with the Kerckhoffs-Shannon principle, the majority of civilian cryptosystems make use of publicly known algorithms. The principle is that of security through transparency, in which open-source software is considered to be inherently more secure than closed source software. On this basis there are several methods by which a system can be attacked where, in each case, it is assumed that the cryptanalyst has full knowledge of the algorithm(s) used.
A. Basic Attacks
We provide a brief overview of the basic attack strategies
associated with cryptanalysis.
Ciphertext-only attack is where the cryptanalyst has the ciphertexts of several messages at their disposal. All messages are assumed to have been encrypted using the same algorithm. The challenge for the cryptanalyst is to try and recover the plaintext from these messages. Clearly, a cryptanalyst will be in a valuable position if they can recover the actual keys used for encryption.
Known-plaintext attack makes the task of the cryptanalyst simpler because, in this case, access is available to both the plaintext and the corresponding ciphertext. It is then necessary to deduce the key used for encrypting the messages, or to design an algorithm to decrypt any new messages encrypted with the same key.
Chosen-plaintext attack involves the cryptanalyst possessing
both the plaintext and the ciphertext. In addition, the analyst
has the ability to encrypt plaintext and see the ciphertext
produced. This provides a powerful tool from which the keys
can be deduced.


Adaptive-chosen-plaintext attack is an improved version of


the chosen-plaintext attack. In this version, the cryptanalyst
has the ability to modify the results based on the previous
encryption. This version allows the cryptanalyst to choose a
smaller block for encryption.
Chosen-ciphertext attack can be applied when the cryptanalyst has access to several decrypted texts. In addition, the cryptanalyst is able to use the text and pass it through a 'black box' for an attempted decrypt. The cryptanalyst has to guess the keys in order to use this method, which is performed on an iterative basis (for different keys) until a decrypt is obtained.
Chosen-key attack is based on some knowledge of the relationship between different keys and is not of practical significance except in special circumstances.
Rubber-hose cryptanalysis is based on the use of human factors such as blackmail and physical threats, for example. It is often a very powerful attack and sometimes very effective.
Differential cryptanalysis is a more general form of
cryptanalysis. It is the study of how differences in an input
can affect differences in the output. This method of attack is
usually based on a chosen plaintext, meaning that the attacker
must be able to obtain encrypted ciphertexts for some set
of plaintexts of their own choosing. This typically involves
acquiring a Crib of some type as discussed in the following
section.
Linear cryptanalysis is a known-plaintext attack which uses linear relations between the inputs and outputs of an encryption algorithm that hold with a certain probability. This approximation can be used to assign probabilities to the possible keys and locate the one that is most probable.
B. Cribs
The problem with any form of chosen-plaintext attack is, of course, how to obtain part or all of the plaintext in the first place. One method that can be used is to obtain a Crib. A Crib, a term that originated at Bletchley Park during the Second World War, is a plaintext which is known or suspected of being part of a ciphertext. If it is possible to compare part of the ciphertext that is known to correspond with the plaintext then, with the encryption algorithm known, one can attempt to identify which key has been used to generate the ciphertext as a whole and thus decrypt an entire message. But how is it possible to obtain any plaintext on the assumption that all plaintexts are encrypted in their entirety? One way is to analyse whether or not there is any bad practice being undertaken by the user, e.g. sending stereotyped (encrypted) messages. Analysing any repetitive features that can be expected is another way of obtaining a Crib. For example, suppose that a user was writing letters using Microsoft Word, having established an electronic letter template

with his/her name, address, company reference number etc. Suppose we assume that each time a new letter is written, the entire document is encrypted using a known algorithm. If it is possible to obtain the letter template then a Crib has been found. Assuming that the user is not prepared to share the electronic template (which would be a strange thing to ask for), a simple way of obtaining the Crib could be to write to the user in hardcopy and ask that the response from the same user is of the same type, pleading ignorance of all forms of ICT or some other excuse. This is typical of methods that are designed to seed a response that includes a useful Crib. Further, there are a number of passive cribs with regard to letter writing that can be assumed, the use of 'Dear' and 'Yours sincerely', for example.
During the Second World War, when passive cribs such as daily weather reports became rare through improvements in the protocols associated with the use of Enigma and/or operators who took their work seriously, Bletchley Park would ask the Royal Air Force to create some 'trouble' that was of little military value. This included seeding a particular area in the North Sea with mines, dropping some bombs on the French coast or, for a more rapid response, asking fighter pilots to go over to France and shoot up targets of opportunity8, processes that came to be known as 'gardening'. The Enigma-encrypted ciphertexts that were used to report the trouble could then be assumed to contain information such as the name of the area where the mines had been dropped and/or the harbour(s) threatened by the mines. It is worth noting that the ability to obtain cribs by gardening was made relatively easy because of the war in which trouble was to be expected and to be reported. Coupled with the efficiency of the German war machine with regard to its emphasis on accurate and timely reports, the British were in a privileged position in which they could create cribs at will and have some fun doing it!
When a captured and interrogated German stated that Enigma operators had been instructed to encode numbers by spelling them out, Alan Turing reviewed decrypted messages and determined that the number 'eins' appeared in 90% of the messages. He automated the crib process, creating an 'Eins Catalogue', which assumed that 'eins' was encoded at all positions in the plaintext. The catalogue included every possible key setting, which provided a very simple and effective way of recovering the key and is a good example of how the statistics (of a word or phrase) can be used in cryptanalysis.
The use of Enigma by the German naval forces (in particular, the U-boat fleet) was, compared to the German army and air force, made secure by using a password from one day to the next. This was based on a code book provided to the operator prior to departure from base. No transmission of the daily passwords was required, passive cribs were rare and seeding activities were difficult to arrange. Thus, if not for a lucky break, in which one of these code books (which were printed in ink that disappeared if they were dropped in seawater) was recovered intact by a British destroyer (HMS Bulldog) from a damaged U-boat (U-110) on May 9, 1941,

8 Using targets of opportunity became very popular towards the end of the war. Fighter pilots were encouraged to, in the words of General J. Doolittle, 'get them in the air, get them on the ground, just get them'.


breaking the Enigma naval transmissions under their time-variant code-book protocol would have been very difficult. A British Naval message dated May 10, 1941 reads: '1. Capture of U Boat 110 is to be referred to as operation Primrose; 2. Operation Primrose is to be treated with greatest secrecy and as few people allowed to know as possible...' Clearly, and for obvious reasons, the British were anxious to make sure that the Germans did not find out that U-110 and its codebooks had been captured, and all the sailors who took part in the operation were sworn to secrecy. On HMS Bulldog's arrival back in Britain, a representative from Bletchley Park photographed every page of every book. The interesting piece of equipment turned out to be an Enigma machine, and the books contained the Enigma codes being used by the German navy.
The U-boat losses that increased significantly through the decryption of U-boat Enigma ciphers led Admiral Karl Doenitz to suspect that his communications protocol had been compromised. He had no firm evidence, just a gut feeling that something was wrong. His mistake was not to do anything about it9, an attitude that was typical of the German High Command, who were certifiable with regard to their confidence in the Enigma system. However, they were not uniquely certifiable. For example, on April 18, 1943, Admiral Yamamoto (the victor of Pearl Harbour) was killed when his plane was shot down while he was attempting to visit his forces in the Solomon Islands. Notification of his visit from Rabaul to the Solomons was broadcast as Morse-coded ciphertext over the radio, information that was being routinely decrypted by the Americans. At this point in the Pacific War, the Japanese were using a code book protocol similar to that used by the German Navy, in which the keys were changed on a daily basis, keys that the Americans had generated copies of. Some weeks before his visit, Yamamoto had been given the option of ordering a new set of code books to be issued. He had refused to give the order on the grounds that the logistics associated with transferring new code books over Japanese-held territory were incompatible with the time scale of his visit and the possible breach of security that could arise through a new code book being delivered into the hands of the enemy. This decision cost him his life. However, it is a decision that reflects the problems associated with the distribution of keys for symmetric cryptosystems, especially when a multi-user protocol needs to be established for execution over a wide communications area. In light of this problem, Yamamoto's decision was entirely rational but, nevertheless, a decision based on the assumption that the cryptosystem had not already been compromised. Perhaps it was his faith in the system and thereby his refusal to think the unthinkable that cost him his life!
The principles associated with cryptanalysis that have been briefly introduced here illustrate the importance of using a dynamic approach to cryptology. Any feature of a security infrastructure that has any degree of consistency is vulnerable to attack. This can include plaintexts that have routine phrases such as those used in letters, the key(s) used to encrypt the

9 An instinct can be worth a thousand ciphers, ten-thousand if you like.

plaintext and the algorithm(s) used for encryption. One of the principal advantages of using chaoticity for designing ciphers is that it provides the cryptographer with a limitless and dynamic resource for producing different encryption algorithms. These algorithms can be randomly selected and permuted to produce, in principle, an unlimited number of Meta encryption engines that operate on random length blocks of plaintext. The use of block cipher encryption is both necessary, in order to accommodate the relatively low cycle length of chaotic ciphers, and desirable, in order to increase the strength of the cipher by implementing a multi-algorithmic approach. Whereas in conventional cryptography emphasis focuses on the number of permutations associated with the keys used to seed or 'drive' an algorithm, chaos-based encryption can focus on the number of permutations associated with the algorithms that are used, algorithms that can, with care and understanding, be quite literally invented on the fly. Since cryptanalysis requires that the algorithm is known and concentrates on trying to find the key, this approach, coupled with other important details that are discussed later on in this paper, provides a method that can significantly enhance the cryptographic strength of the ciphertext. Further, in order to satisfy the innocence of the academic community, it is, of course, possible to openly publish such algorithms (as in this paper, for example), but in the knowledge that many more can be invented and published or otherwise. This provides the potential for generating a host of 'home-spun' ciphers which can be designed and implemented by anyone who wishes to by-pass established practices and 'cook it themselves'.
V. DIFFUSION AND CONFUSION
In terms of plaintexts, diffusion is concerned with the issue that, at least on a statistical basis, similar plaintexts should result in completely different ciphertexts even when encrypted with the same key. This requires that any element of the input block influences every element of the output block in an irregular fashion. In terms of a key, diffusion ensures that similar keys result in completely different ciphertexts even when used for encrypting the same block of plaintext. This requires that any element of the input should influence every element of the output in an irregular way. This property must also be valid for the decryption process, because otherwise an intruder may be able to recover parts of the input from an observed output by a partly correct guess of the key used for encryption. The diffusion process is a function of the sensitivity to initial conditions that a cryptographic system should have and, further, of the inherent topological transitivity that the system should also exhibit, causing the plaintext to be mixed through the action of the encryption process.

Confusion ensures that the (statistical) properties of plaintext blocks are not reflected in the corresponding ciphertext blocks. Instead, every ciphertext must have a random appearance to any observer and be quantifiable through appropriate statistical tests. However, diffusion and confusion are processes that are of fundamental importance in the design and analysis of cryptological systems, not only for the encryption of plaintexts but for data transformation in general.


A. The Diffusion Equation


In a variety of physical problems, the process of diffusion can be modelled in terms of certain solutions to the diffusion equation whose basic (linear) form is given by10 [44]-[47]

(∇^2 - σ ∂/∂t) u(r, t) = -S(r, t),

∇^2 = ∂^2/∂x^2 + ∂^2/∂y^2 + ∂^2/∂z^2,

u(r, 0) = u_0(r), σ = 1/D,

where D is the Diffusivity, S is a source term and u is a function which describes physical properties such as temperature, light, particle concentration and so on, with initial value u_0.
The diffusion equation describes fields u that are the result of an ensemble of incoherent random walk processes, i.e. walks whose direction changes arbitrarily from one step to the next and where the most likely position after a time t is proportional to √t. Further, the diffusion equation differentiates between past and future, i.e. between ±t. This is because the diffusing field u represents the behaviour of some average property of an ensemble of many agents which cannot in general go back to their original state. This fundamental property of diffusive processes has a synergy with the use of one-way functions in cryptology, i.e. functions that, given an input, produce an output that is not reversible - an output from which it is not possible to compute the input.
Consider the process of diffusion in which a source of material diffuses into a surrounding homogeneous medium, the material being described by some initial condition u(r, 0). Physically, it is to be expected that the material will increasingly spread out as time evolves and that the concentration of the material decreases further away from the source. The general solution to the diffusion equation yields a result in which the spatial concentration of material is given by the convolution of the initial condition with a Gaussian function. This solution is determined by considering how the process of diffusion responds to a single point source, which yields the Green's function (in this case, a Gaussian function) given by [47]-[49]

G(r, t) = (σ/4πt)^{n/2} exp(-σr^2/4t), t > 0, r = |r|,

which is the solution to

(∇^2 - σ ∂/∂t) G(r, t) = -δ^n(r)δ(t)

where δ denotes the Dirac delta function [50], [51] and n = 1, 2, 3 determines the dimension of the solution.

Fig. 2. Image of an optical source (left), the same source imaged through steam (centre) and a simulation of this effect obtained by convolving the source image with a Gaussian Point Spread Function (right).
In the infinite domain, the general solution to the diffusion equation can be written in the form [47]

u(r, t) = G(r, t) ⊗_r ⊗_t S(r, t) + G(r, t) ⊗_r u(r, 0),

which requires that the spatial extent of the source function is infinite but can include functions that are localised provided that S → 0 as r → ∞ - a Gaussian function, for example.

10 r = x x̂ + y ŷ + z ẑ denotes the spatial vector and t denotes time.


The solution is composed of two terms. The first term is the convolution (in space and time, denoted by ⊗_r and ⊗_t respectively) of the source function with the Green's function, and the second term is the convolution (in space only) of the initial condition u(r, 0) with the same Green's function, where

G(r, t) ⊗_r ⊗_t S(r, t) = ∫∫ G(|r - r'|, t - τ) S(r', τ) d^3r' dτ.

Thus, for example, in two dimensions, for the case when S = 0, and ignoring scaling by σ/(4πt), the solution for u is

u(x_0, y_0, t) = exp[-σ(x^2 + y^2)/(4t)] ⊗⊗ u_0(x, y)

where we have introduced ⊗⊗ to denote the two-dimensional convolution integral. Here, the field at time t > 0 is given by the field at time t = 0 convolved with the two-dimensional Gaussian function exp[-σ(x^2 + y^2)/(4t)]. This result can, for example, be used to model the diffusion of light through an optical diffuser. An example of such an effect is given in
Figure 2, which shows a light source (the ceiling light of a steam room) imaged through air and through steam, together with a simulation. Steam, being composed of a complex of small water droplets, affects light by scattering it a large number of times. The high degree of multiple scattering that takes place allows us to model the transmission of light in terms of a diffusive rather than a propagative process, where the function u is taken to denote the intensity of light. The initial condition u_0 is taken to denote the initial image which is, in effect, the image of the light source recorded through air. As observed in Figure 2, the details associated with the light source are blurred through a mixing process which is determined by the Gaussian function that is characteristic of diffusion processes in general. In imaging science, functions of this type determine how a point of light is affected by the convolution process11 and are thus referred to as the Point Spread Function or PSF [52]. The PSF is a particularly important characteristic of any imaging system in general, a characteristic that is related to the physical processes through which light is transformed from the object plane (input) to the image plane (output).
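As a concrete illustration of this diffusion model (not code from the paper; the grid size, the value of the product of diffusivity and time, and the square source are arbitrary choices), the following numpy sketch blurs a simple source image by convolving it with a Gaussian PSF using the convolution theorem and the FFT, as in the simulation shown in Figure 2.

import numpy as np

# Diffusion as convolution with a Gaussian PSF, computed via the
# convolution theorem and the FFT. 'Dt' denotes the product of the
# diffusivity and the diffusion time.
N, Dt = 256, 4.0
x = np.fft.fftfreq(N) * N                  # grid coordinates, origin at [0, 0]
X, Y = np.meshgrid(x, x, indexing="ij")
psf = np.exp(-(X**2 + Y**2) / (4.0 * Dt))  # Gaussian Point Spread Function
psf /= psf.sum()                           # normalise: total intensity conserved

u0 = np.zeros((N, N))
u0[120:136, 120:136] = 1.0                 # a square 'light source' as input

u = np.real(np.fft.ifft2(np.fft.fft2(psf) * np.fft.fft2(u0)))
print(u0.max(), u.max(), u.sum())          # peak is smeared out; sum preserved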
If we record a diffused field u after some time t = T, is it possible to reconstruct the field at time t = 0, i.e. to solve the inverse problem or 'de-diffuse' the field measured? We can

11 Convolution is sometimes referred to by its German equivalent, i.e. by the word Faltung, which means mixing or diffusing.


express u(r, 0) in terms of u(r, T) using the Taylor series

u_0(r) ≡ u(r, 0) = u(r, T) + Σ_{n=1}^∞ [(-1)^n/n!] T^n [∂^n u(r, t)/∂t^n]_{t=T}.

From the diffusion equation,

∂^2 u/∂t^2 = D (∂/∂t) ∇^2 u = D^2 ∇^4 u,

∂^3 u/∂t^3 = D^2 (∂/∂t) ∇^4 u = D^3 ∇^6 u

and so on. Thus, in general, we can write

[∂^n u(r, t)/∂t^n]_{t=T} = D^n ∇^{2n} u(r, T)

and, after substituting this result into the series for u_0 given above, we obtain

u_0(r) = u(r, T) + Σ_{n=1}^∞ [(-1)^n/n!] (DT)^n ∇^{2n} u(r, T) ≈ u - DT ∇^2 u, DT << 1.

Fig. 3. Progressive diffusion and confusion of an image (top-left) - from left to right and from top to bottom - for uniformly distributed noise. The convolution is undertaken using the convolution theorem and a Fast Fourier Transform (FFT).

B. Diffusion of a Stochastic Source


For the case when

(∇^2 - σ ∂/∂t) u(r, t) = -S(r, t), u(r, 0) = 0,

the solution is

u(r, t) = G(r, t) ⊗_r ⊗_t S(r, t), t > 0.

If a source is introduced in terms of an impulse at t = 0, then the system will react accordingly and diffuse for t > 0. This is equivalent to introducing a source function of the form

S(r, t) = s(r)δ(t).

The solution is then given by

u(r, t) = G(r, t) ⊗_r s(r), t > 0.

Observe that this solution is of the same form as the homogeneous case with initial condition u(r, 0) = u_0(r) and that the solution for initial condition u(r, 0) = u_0(r) is given by

u(r, t) = G(r, t) ⊗_r [s(r) + u_0(r)] = G(r, t) ⊗_r u_0(r) + n(r, t)

where

n(r, t) = G(r, t) ⊗_r s(r), t > 0.

Note that if s is a stochastic function (i.e. a random dependent variable characterised, at least, by a Probability Density Function (PDF) denoted by Pr[s(r)]), then n will also be a stochastic function. Also note that, for a time-independent source function S(r),

u_0(r) = u(r, T) + Σ_{n=1}^∞ [(-1)^n/n!] (DT)^n [∇^{2n} u(r, T) + D^{-1} ∇^{2n-2} S(r)]

and that, if S is a stochastic function, then the field u cannot be de-diffused, since it is not possible to evaluate u_0 exactly given Pr[S(r)]. In other words, any error or noise associated with diffusion leads to the process being irreversible - a one-way process. This, however, depends on the magnitude of the diffusivity D which, for large values, cancels out the effect of any noise, thus making the process potentially reversible. In cryptography, it is therefore important that the process of diffusion applied (in order that a key affects every bit of the plaintext irrespective of the encryption algorithm that is used) has a low diffusivity.
The inclusion of a stochastic source function provides us with a self-consistent introduction to another important concept in cryptology, namely confusion. Taking, for example, the two-dimensional case, the field u is given by

u(x, y) = (σ/4πt) exp[-σ(x^2 + y^2)/(4t)] ⊗⊗ u_0(x, y) + n(x, y).

We thus arrive at a basic model for the process of diffusion and confusion, namely

Output = Diffusion + Confusion.

Here, diffusion involves the mixing of the initial condition with a Gaussian function, and confusion is compounded in the addition of a stochastic or noise function to the diffused output. The relative magnitudes of the two terms determine the dominating effect. As the noise function n increases in amplitude relative to the diffusion term, the output will become increasingly determined by the effect of confusion alone. In the equation above, this will occur as t increases, since the magnitude of the diffusion term depends on the scaling factor 1/t. This is illustrated in Figure 3, which shows the combined effect of diffusion and confusion for an image of the phrase 'Confusion + Diffusion' as it is (from left to right and from top to bottom) progressively diffused (increasing values of t) and increasingly confused for a stochastic function n that is uniformly distributed. Clearly, the longer the time taken for the process of diffusion to occur, the more the output is confusion dominated. This is consistent with all cases when the level of confusion is high and when the stochastic field used to generate this level of confusion


is unknown (other than possible knowledge of its PDF). However, if the stochastic function has been synthesized12 and is thus known a priori, then we can compute

u(x, y) - n(x, y) = (σ/4πt) exp[-σ(x^2 + y^2)/(4t)] ⊗⊗ u_0(x, y),

from which u_0 may be computed approximately via application of a deconvolution algorithm [52].

12 The synthesis of stochastic functions is a principal issue in cryptology.
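The following numpy sketch (illustrative; the grid, noise amplitude and regularisation constant are arbitrary, and a Wiener-type filter stands in for the deconvolution algorithm of [52]) implements the diffusion + confusion model and the known-noise recovery: the synthesized noise n is subtracted and u_0 is then estimated approximately by deconvolution.

import numpy as np

# Diffusion + confusion: u = (Gaussian PSF) convolved with u0, plus
# uniformly distributed noise n. If n is known a priori, it can be
# subtracted and u0 recovered approximately by regularised deconvolution.
N, Dt = 256, 2.0
x = np.fft.fftfreq(N) * N
X, Y = np.meshgrid(x, x, indexing="ij")
G = np.exp(-(X**2 + Y**2) / (4.0 * Dt))
G /= G.sum()
Gf = np.fft.fft2(G)

u0 = np.zeros((N, N)); u0[100:156, 100:156] = 1.0
n = np.random.default_rng(0).uniform(-0.05, 0.05, (N, N))  # confusion term
u = np.real(np.fft.ifft2(Gf * np.fft.fft2(u0))) + n        # diffusion + confusion

eps = 1e-6                                 # regularisation constant
U = np.fft.fft2(u - n)                     # subtract the known noise field
u0_est = np.real(np.fft.ifft2(U * np.conj(Gf) / (np.abs(Gf)**2 + eps)))
print(np.abs(u0_est - u0).mean())          # small mean error: approximate recovery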

VI. S TOCHASTIC F IELDS


By considering the diffusion equation for a stochastic
source, we have derived a basic model for the solution
eld or output u(r, t) in terms of the initial condition or
input u0 (r). We now consider the principal properties of
stochastic elds, considering the case where the elds are
random variables that are functions of time t.

E(f ) =

xPf (x)dx,

which computes the mean value of the random variable, the


Moment Generating Function as

E[exp(kf )] =

exp(kx)Pf (x)dx

which may not always exist and the Characteristic Function


as

E[exp(ikf )] =

A. Independent Random Variables


Two random variables f1 (t) and f2 (t) are independent if
their cross-correlation function is zero, i.e.

f1 (t + )f2 ( )d = f1 (t)

f2 (t) = 0

where is used to denote the correlation integral above. From


the correlation theorem [53], [54], it then follows that

exp(ikx)Pf (x)dx

which will always exist. Observe that the moment generating


function is the (two-sided) Laplace transform [58] of Pf and
the Characteristic Function is the Fourier transform of Pf .
Thus, if f (t) is a stochastic function which is the sum of
N independent random variables f1 (t), f2 (t), ..., fN (t) with
distributions Pf1 (x), Pf2 (x), ..., PfN (x), then
f (t) = f1 (t) + f2 (t) + ... + fN (t)

F1 ()F2 () = 0

and

where

F1 () =

E[exp(ikf )] = E[exp[ik(f1 + f2 + ... + fN )]

f1 (t) exp(it)dt

= E[exp(ikf1 )]E[exp(ikf2 )]...E[exp(ikfN )]

and

F2 () =

= F1 [Pf1 ]F1 [Pf2 ]...F1 [PfN ]

f2 (t) exp(it)dt.

If each function has a PDF Pr[f1 (t)] and Pr[f2 (t)] respectively, the PDF of the function f (t) that is the sum of f1 (t) and
f2 (t) is given by the convolution of Pr[f1 (t)] and Pr[f2 (t)],
i.e. the PDF of the function
f (t) = f1 (t) + f2 (t)
is given by [55], [56]
Pr[f (t)] = Pr[f1 (t)] Pr[f2 (t)].
Further, for a number of statistically independent
stochastic functions f1 (t), f2 (t), ..., each with a PDF
Pr[f1 (t)], Pr[f2 (t)], ..., the PDF of the sum of these
functions, i.e.
f (t) = f1 (t) + f2 (t) + f3 (t) + ...

where F1 is the one-dimensional Fourier transform operartor


dened as

In other words, the Characteristic Function of the random


variable f (t) is the product of the Characteristic Functions for
all random variables whose sum if f (t). Using the convolution
theorem for Fourier transforms, we then obtain
N

Pf (x) =

12 The

synthesis of stochastic functions is a principal issue in cryptology.

Pfi (x) = Pf1 (x) Pf2 (x) ... PfN (x).

i=1

Further, we note that if f1 , f2 , ..., fN are all identically


distributed then
N

E[exp[ik(f1 + f2 + ... + fN )] = (F[Pf1 ])

is given by
Pr[f (t)] = Pr[f1 (t)] Pr[f2 (t)] Pr[f1 (t)] ...

dx exp(ikx).

and
Pf (x) = Pf1 (x) Pf1 (x) ...
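The convolution result above is easily verified numerically. The following sketch (Python with numpy assumed; the bin counts are arbitrary choices) compares a histogram of the sum of two uniform random variables with the discrete convolution of their individual histograms:

import numpy as np

rng = np.random.default_rng(1)
f1 = rng.uniform(size=100000)
f2 = rng.uniform(size=100000)

# Histogram estimate of Pr[f1 + f2] over [0, 2] ...
hist_sum, _ = np.histogram(f1 + f2, bins=100, range=(0, 2), density=True)

# ... compared with the convolution of the two individual PDF estimates.
p1, _ = np.histogram(f1, bins=50, range=(0, 1), density=True)
p2, _ = np.histogram(f2, bins=50, range=(0, 1), density=True)
conv = np.convolve(p1, p2) * (1.0 / 50)  # triangular distribution on [0, 2]

print(hist_sum[:5])
print(conv[:5])

Both estimates approximate the same triangular distribution, agreeing to within binning and sampling error.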


B. The Central Limit Theorem


The Central Limit Theorem stems from the result that
the convolution of two functions generally yields a function
which is smoother than either of the functions that are being
convolved. Moreover, if the convolution operation is repeated,
then the result starts to look more and more like a Gaussian
function - a normal distribution - at least in an approximate
sense [59], [60]. For example, suppose we have a number of
independent random variables each of which is characterised
by a distribution that is uniform. As we add more and more of
these functions together, the resulting distribution is then given
by convolving more and more of these (uniform) distributions.
As the number of convolutions increases, the result tends to
a Gaussian distribution. If we consider the effect of applying
multiple convolutions of the uniform distribution
$$P(x) = \begin{cases} \dfrac{1}{X}, & |x| \leq X/2 \\ 0, & \text{otherwise} \end{cases}$$

then, by considering the effect of multiple convolutions in Fourier space (through application of the convolution theorem) and working with a series representation of the result, it can be shown that (see Appendix I)

$$\bigotimes_{i=1}^{N} P_i(x) \equiv P_1(x) \otimes P_2(x) \otimes \cdots \otimes P_N(x) \simeq \sqrt{\frac{6}{\pi N X^2}} \exp\left(-\frac{6x^2}{X^2 N}\right)$$

where P_i(x) = P(x) ∀i and N is large.

Fig. 4. Illustration of the Central Limit Theorem. The top-left image shows plots of a 500 element uniformly distributed time series and its histogram using 32 bins. The top-right image shows the result of adding two uniformly distributed and independent time series together and the 32 bin histogram. The bottom-left image is the result after adding three uniformly distributed time series and the bottom-right image is the result of adding four uniformly distributed time series.

Figure 4 illustrates the effect of successively adding uniformly distributed but independent random time series (each consisting of 500 elements) and plotting the resulting histograms (using 32 bins), i.e. given the discrete time series f_1[i], f_2[i], f_3[i], f_4[i] for i = 1 to 500, Figure 4 shows the time series

$$s_1[i] = f_1[i],$$
$$s_2[i] = f_1[i] + f_2[i],$$
$$s_3[i] = f_1[i] + f_2[i] + f_3[i],$$
$$s_4[i] = f_1[i] + f_2[i] + f_3[i] + f_4[i]$$

and the corresponding 32-bin histograms of the signals s_j, j = 1, 2, 3, 4. Clearly, as j increases, the histogram starts to look increasingly normally distributed. Here, the uniformly distributed discrete time series f_i, i = 1, 2, 3, 4 have been computed using the uniform pseudo random number generator

$$f_{i+1} = a f_i \bmod P$$

where a = 7^7 and P = 2^31 - 1 is a Mersenne prime number, by using four different seeds f_0 in order to provide time series that are independent.

The Central Limit Theorem has been considered specifically for the case of uniformly distributed independent random variables. However, in general, it is approximately applicable for all independent random variables, irrespective of their distribution. In particular, we note that for a standard normal (Gaussian) distribution given by

$$\text{Gauss}(x; \sigma, \mu) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right)$$

where

$$\int \text{Gauss}(x)\, dx = 1$$

and

$$\int \text{Gauss}(x) \exp(ikx)\, dx = \exp(ik\mu) \exp\left(-\frac{\sigma^2 k^2}{2}\right),$$

then, since

$$\text{Gauss}(x) \leftrightarrow \exp(ik\mu) \exp\left(-\frac{\sigma^2 k^2}{2}\right),$$

$$\bigotimes_{j=1}^{N} \text{Gauss}(x) \leftrightarrow \exp(ikN\mu) \exp\left(-\frac{N\sigma^2 k^2}{2}\right)$$

so that

$$\bigotimes_{j=1}^{N} \text{Gauss}(x) = \frac{1}{\sqrt{2\pi N}\,\sigma} \exp\left(-\frac{(x - N\mu)^2}{2N\sigma^2}\right)$$

where $\leftrightarrow$ denotes transformation from real to Fourier space. In other words, the addition of Gaussian distributed fields produces a Gaussian distributed field.
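A sketch of the experiment behind Figure 4 is given below (Python with numpy assumed). The Lehmer-type generator follows the iteration quoted above; the seed values are arbitrary:

import numpy as np

def lehmer(seed, n, a=7**7, P=2**31 - 1):
    f, out = seed, []
    for _ in range(n):
        f = (a * f) % P
        out.append(f / P)  # map to uniform floats on (0, 1)
    return np.array(out)

s = np.zeros(500)
for j, seed in enumerate((1, 2, 3, 4), start=1):
    s = s + lehmer(seed, 500)  # s_j[i] = f_1[i] + ... + f_j[i]
    hist, _ = np.histogram(s, bins=32)
    print(j, hist)  # the histogram becomes increasingly bell-shaped

As j increases, the 32-bin histogram of s_j migrates from an approximately flat profile towards the peaked profile expected from the Central Limit Theorem.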


VII. STOCHASTIC DIFFUSION


Given the classical diffusion/confusion model of the type

$$u(\mathbf{r}) = p(\mathbf{r}) \otimes_{\mathbf{r}} u_0(\mathbf{r}) + n(\mathbf{r})$$

discussed above, we note that both the operator and the functional form of p are derived from solving a physical problem (using a Green's function solution) compounded in a particular PDE, the diffusion equation. This is an example of Gaussian diffusion since the characteristic Point Spread Function is a Gaussian function. However, we can use this basic model and consider a variety of PSFs as required. Although arbitrary changes to the PSF are inconsistent with classical diffusion, in cryptology we can, in principle, choose any PSF that is of value in diffusing the data. For example, in Fresnel optics [61], [62], the PSF is of the same Gaussian form but with a complex exponential. If f(x, y) is the object function describing the object plane and u(x, y) is the image plane wave function, then [63], [64]

$$u(x, y) = p(x, y) \otimes \otimes\, f(x, y)$$

where the PSF p is given by (ignoring scaling) [52]

$$p(x, y) = \exp[i\alpha(x^2 + y^2)]; \quad |x| \leq X, \ |y| \leq Y$$

where α = π/(λz), λ being the wavelength and z the distance between the object and image planes, and where X and Y determine the spatial support of the PSF.
Stochastic diffusion involves interchanging the roles of p and n, i.e. replacing p(r), a deterministic PSF, with n(r), a stochastic function. Thus, noise diffusion is compounded in the result

$$u(\mathbf{r}) = n(\mathbf{r}) \otimes_{\mathbf{r}} u_0(\mathbf{r}) + p(\mathbf{r})$$

where p can be any function, or

$$u(\mathbf{r}) = n_1(\mathbf{r}) \otimes_{\mathbf{r}} u_0(\mathbf{r}) + n_2(\mathbf{r})$$

where both n_1 and n_2 are stochastic functions which may be of the same type (i.e. have the same PDFs) or of different types (with different PDFs). This form of diffusion is not physical in the sense that it does not conform to a physical model as defined by the diffusion equation, for example. Here, n(r) can be any stochastic function (synthesized or otherwise). The simplest form of noise diffusion is

$$u(\mathbf{r}) = n(\mathbf{r}) \otimes_{\mathbf{r}} u_0(\mathbf{r}).$$
There are two approaches to solving the inverse problem: given u and n, obtain u_0. We can invert or deconvolve by using the convolution theorem giving (for dimension n = 1, 2, 3)

$$u_0(\mathbf{r}) = F_n^{-1}\left[\frac{U(\mathbf{k}) N^*(\mathbf{k})}{|N(\mathbf{k})|^2}\right]$$

where N is the Fourier transform of n and U is the Fourier transform of u. However, this approach requires regularisation in order to eliminate any singularities when |N|^2 = 0 through application of a constrained deconvolution filter such as the Wiener filter [52]. Alternatively, if n is the result of some random number generating algorithm, we can construct the stochastic field

$$m(\mathbf{r}) = F_n^{-1}\left[\frac{N(\mathbf{k})}{|N(\mathbf{k})|^2}\right]$$

where |N(k)|^2 > 0, the diffused field now being given by

$$u(\mathbf{r}) = m(\mathbf{r}) \otimes_{\mathbf{r}} u_0(\mathbf{r}).$$

The inverse problem is then solved by correlating u with n, since

$$n(\mathbf{r}) \odot_{\mathbf{r}} u(\mathbf{r}) \leftrightarrow N^*(\mathbf{k}) U(\mathbf{k})$$

and

$$N^*(\mathbf{k}) U(\mathbf{k}) = N^*(\mathbf{k}) M(\mathbf{k}) U_0(\mathbf{k}) = N^*(\mathbf{k}) \frac{N(\mathbf{k})}{|N(\mathbf{k})|^2} U_0(\mathbf{k}) = U_0(\mathbf{k})$$

so that

$$u_0(\mathbf{r}) = n(\mathbf{r}) \odot_{\mathbf{r}} u(\mathbf{r}).$$

The condition that |N(k)| > 0 is simply achieved by implementing the following process: ∀k, if |N(k)|^2 = 0, then |N(k)|^2 = 1. This result can be used to embed one data field in another.
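The following sketch (Python with numpy assumed; the array sizes and the seed are arbitrary) implements this pre-conditioned diffusion and demonstrates that correlation with the noise field recovers u_0 exactly, up to floating point error:

import numpy as np

rng = np.random.default_rng(1873)  # the seed acts as the private key
u0 = np.zeros((64, 64)); u0[20:44, 28:36] = 1.0

n = rng.uniform(size=u0.shape)
N = np.fft.fft2(n)
power = np.abs(N)**2
power[power == 0] = 1.0  # enforce |N(k)|^2 > 0, as described above

M = N / power  # spectrum of the pre-conditioned diffuser m
u = np.real(np.fft.ifft2(M * np.fft.fft2(u0)))  # u = m (**) u0

# Recovery by correlation: u0 = n (.) u, i.e. F^-1[conj(N) U]
u0_rec = np.real(np.fft.ifft2(np.conj(N) * np.fft.fft2(u)))
print(np.allclose(u0_rec, u0))  # True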
Consider the case when we have two independent images I_1(x, y) ≥ 0 ∀x, y and I_2(x, y) ≥ 0 ∀x, y and we consider the case of embedding I_1 with I_2. We construct a noise field m(x, y) ∀x, y a priori and consider the equation

$$u(x, y) = R\, m(x, y) \otimes \otimes\, I_1(x, y) + I_2(x, y)$$

where

$$\| m(x, y) \otimes \otimes\, I_1(x, y) \| = 1 \quad \text{and} \quad \| I_2(x, y) \| = 1.$$

By normalising the terms in this way, the coefficient 0 ≤ R ≤ 1 can be used to adjust the relative magnitudes of the terms such that the diffused image I_1 is a perturbation of the host image I_2. This provides us with a way of watermarking [65] one image with another, R being referred to as the watermarking ratio^13. This approach could of course be implemented using a Fresnel diffuser. However, for applications in image watermarking, the diffusion of an image with a noise field provides a superior result because: (i) a noise field provides more uniform diffusion; (ii) noise fields can be generated using random number generators that depend on a single initial value or seed (i.e. a private key). An example of this approach is shown in Figure 5. Here, an image I_2 (the host image) is watermarked by another image I_1 (the watermark image) and, because R = 0.1, the output u is dominated by the image I_2. The noise field n is computed using a uniform random number generator in which the output array n is normalized so that ‖n‖ = 1 and used to generate n(x_i, y_i) on a row-by-row basis. Here, the seed is any integer such as 1873... which can be based on the application of a PIN (Personal Identity Number) or a password (e.g. Enigma, which in terms of an ASCII string - using binary to decimal conversion - is 216257556149). Recovery of the watermark image requires

13 Equivalent, in this application, to the standard term Signal-to-Noise Ratio or SNR as used in signal and image analysis.


knowledge of the PIN or password and the host image I_2. The effect of adding the diffused watermark image to the host image yields a different, slightly brighter image because of the perturbation of I_2 by R m ⊗⊗ I_1. This effect can be minimized by introducing a smaller watermarking ratio such that the perturbation is still recoverable by subtracting the host image from the watermarked image.

Fig. 5. Example of watermarking an image with another image using noise based diffusion. The host image I_2 (top-left) is watermarked with the watermark image I_1 (top-centre) using the diffuser (top-right) given by a uniform noise field n whose pixel-by-pixel values depend upon the seed used (the private key). The result of computing m ⊗⊗ I_1 (bottom-left) is added to the host image for R = 0.1 to generate the watermarked image u (bottom-centre). Recovery of the watermark image I_1 (bottom-right) is accomplished by subtracting the host image from the watermarked image and correlating the result with the noise field n.

Fig. 6. Binary image (top-left), uniformly distributed 2D noise field (top-centre), convolution (top-right) and associated 64-bin histograms (bottom-left, -centre and -right respectively).
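A sketch of this watermarking scheme is given below (Python with numpy assumed; the images are synthetic stand-ins and the normalisation is taken to be with respect to the maximum absolute value):

import numpy as np

def fft_convolve(a, b):
    return np.real(np.fft.ifft2(np.fft.fft2(a) * np.fft.fft2(b)))

rng = np.random.default_rng(1873)  # private key (the seed)
I1 = np.zeros((64, 64)); I1[16:48, 30:34] = 1.0  # watermark image
I2 = rng.uniform(size=(64, 64))  # stand-in for the host image

n = rng.uniform(size=(64, 64))  # noise field generated from the key
N = np.fft.fft2(n); P = np.abs(N)**2; P[P == 0] = 1.0
m = np.real(np.fft.ifft2(N / P))  # pre-conditioned diffuser

w = fft_convolve(m, I1); w = w / np.abs(w).max()  # normalised m (**) I1
host = I2 / np.abs(I2).max()
R = 0.1
u = R * w + host  # watermarked image, dominated by the host

# Recovery: subtract the host, then correlate with the noise field n
d = (u - host) / R
I1_rec = np.real(np.fft.ifft2(np.conj(N) * np.fft.fft2(d)))
print(np.corrcoef(I1_rec.ravel(), I1.ravel())[0, 1])  # close to 1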
The expected statistical distribution associated with the
output of a noise diffusion process is Gaussian. This can be
shown if we consider u0 to be a strictly deterministic function
described by a sum of delta functions, equivalent to a binary
stream in 1D or a binary image in 2D (discrete cases), for
example. Thus, if

$$u_0(\mathbf{r}) = \sum_i \delta^n(\mathbf{r} - \mathbf{r}_i)$$

then

$$u(\mathbf{r}) = n(\mathbf{r}) \otimes_{\mathbf{r}} u_0(\mathbf{r}) = \sum_{i=1}^{N} n(\mathbf{r} - \mathbf{r}_i).$$

Now, each function n(r - r_i) is just n(r) shifted by r_i and will thus be identically distributed. Hence

$$\Pr[u(\mathbf{r})] = \Pr\left[\sum_{i=1}^{N} n(\mathbf{r} - \mathbf{r}_i)\right] = \bigotimes_{i=1}^{N} \Pr[n(\mathbf{r})]$$

and from the Central Limit Theorem, we can expect Pr[u(r)] to be normally distributed for large N. This is illustrated in Figure 6 which shows the statistical distributions associated with a binary image, a uniformly distributed noise field and the output obtained by convolving the two fields together.
Given the equation

$$u(\mathbf{r}) = p(\mathbf{r}) \otimes_{\mathbf{r}} u_0(\mathbf{r}) + n(\mathbf{r}),$$

if the diffusion by noise is based on interchanging p and n, then the diffusion of noise is based on interchanging u_0 and n. In effect, this means that we consider the initial field u_0 to be a stochastic function. Note that the solution to the inhomogeneous diffusion equation for a stochastic source S(r, t) = s(r)δ(t) is

$$n(\mathbf{r}, t) = G(\mathbf{r}, t) \otimes_{\mathbf{r}} s(\mathbf{r})$$

and thus, n can be considered to be diffused noise. If we consider the model

$$u(\mathbf{r}) = p(\mathbf{r}) \otimes_{\mathbf{r}} n(\mathbf{r}),$$

then for the classical diffusion equation, the PSF is a Gaussian function. In general, given the convolution operation, p can be regarded as only one of a number of PSFs that can be considered in the production of different stochastic fields u. This includes PSFs that define self-affine stochastic fields or random scaling fractals [66]-[68] that are based on fractional diffusion processes.
A. Print Authentication

The methods discussed above refer to electronic-to-electronic type communications in which there is no loss of information. Steganography and watermarking techniques can also be developed for hardcopy data, which has a range of applications. These techniques have to be robust to the significant distortions generated by the printing and/or scanning process. A simple approach is to add information to a printed page that is difficult to see. For example, some modern colour laser printers, including those manufactured by HP and Xerox, print tiny yellow dots which are added to each page. The dots are barely visible and contain encoded printer serial numbers, date and time stamps. This facility provides a useful forensics tool for tracking the origins of a printed document, one which has only relatively recently been disclosed.
If the watermarked image is printed and scanned back into
electronic form, then the print/scan process will yield an array
of pixels that will be significantly different from the original
electronic image even though it might look the same. These
differences can include the size of the image, its orientation,
brightness, contrast and so on. Of all the processes involved
in the recovery of the watermark, the subtraction of the host


image from the watermarked image is critical. If this process is not accurate on a pixel-by-pixel basis and the data is deregistered for any of a number of reasons, then recovery of the watermark by correlation will not be effective. However, if we make use of the diffusion process alone, then the watermark can be recovered via a print/scan because of the compatibility of the processes involved. However, in this case, the watermark is not covert but overt.
Depending on the printing process applied, a number of distortions will occur which diffuse the information being printed. Thus, in general, we can consider the printing process to introduce an effect that can be represented by the convolution equation

$$u_{\text{print}} = p_{\text{print}} \otimes \otimes\, u$$

where u is the original electronic form of a diffused image (i.e. u = n ⊗⊗ u_0) and p_print is the point spread function of the printer. An incoherent image of the data, obtained using a flat-bed scanner for example (or any other incoherent optical imaging system), will also have a characteristic point spread function p_scan. Thus, we can consider a scanned image to be given by

$$u_{\text{scan}} = p_{\text{scan}} \otimes \otimes\, u_{\text{print}}$$

where u_scan is taken to be the digital image obtained from the scan. Now, because convolution is commutative, we can write

$$u_{\text{scan}} = p_{\text{scan}} \otimes \otimes\, p_{\text{print}} \otimes \otimes\, n \otimes \otimes\, u_0 = n \otimes \otimes\, p_{\text{scan/print}} \otimes \otimes\, u_0$$

where

$$p_{\text{scan/print}} = p_{\text{scan}} \otimes \otimes\, p_{\text{print}}$$
which is the print/scan point spread function associated with the processing cycle of printing the image and then scanning it. By applying the method discussed earlier, we can obtain a reconstruction of the watermark whose fidelity is determined by the scan/print point spread function. However, in practice, the scanned image needs to be re-sized to that of the original. This is due to the scaling relationship (for a function f with Fourier transform F)

$$f(\alpha x, \alpha y) \leftrightarrow \frac{1}{\alpha^2} F\left(\frac{k_x}{\alpha}, \frac{k_y}{\alpha}\right).$$

The size of any image captured by a scanner or other device


will depend on the resolution used. The size of the image
obtained will inevitably be different from the original because
of the resolution and window size used to print the diffused
image u and the resolution used to scan the image. Since
scaling in the spatial domain causes inverse scaling in the
Fourier domain, the scaling effect must be inverted before the
watermark can be recovered by correlation since correlation
is not a scale invariant process. Re-sizing the image (using an
appropriate interpolation scheme such as the bi-cubic method,
for example) requires a set of two numbers n and m (i.e.
the n m array used to generate the noise eld and execute
the diffusion process) that, along with the seed required to
regenerate the noise eld, provides the private keys needed
to recover the data from the diffused image. An example of
this approach is given in Figure 7 which shows the result of reconstructing four different images (a photograph, nger-print,

Fig. 7. Example of the application of diffusion only watermarking. In this example, four images of a face, finger-print, signature and text have been diffused using the same noise field m and printed on the front (top-left) and back (bottom-left) of an impersonalized identity card using a 600 dpi printer. The reconstructions (top-right and bottom-right, respectively) are obtained using a conventional flat-bed scanner based on a 300 dpi grey-level scan.

signature and text) used in the design of an impersonalized


debit/credit card. The use of diffusion only watermarking
for print security can be undertaken in colour by applying
exactly the same diffusion/reconstruction methods to the red,
green and blue components independently. This provides two
additional advantages: (i) the effect of using colour tends
to yield better quality reconstructions because of the colour
combination process; (ii) for each colour component, it is
possible to apply a noise eld with a different seed. In this
case, three keys are required to recover the watermark.
Because this method is based on convolution alone and since

$$u_{\text{scan}} = p_{\text{scan/print}} \otimes \otimes\, n \otimes \otimes\, u_0$$

as discussed earlier, the recovery of the watermark will not be negated by the distortion of the point spread function associated with the print/scan process, just limited or otherwise by its characteristics. Thus, if an image is obtained of the printed data field n ⊗⊗ u_0 which is out of focus due to the characteristics of p_scan/print, then the reconstruction of u_0 will be out of focus to the same degree. Decryption of images with this characteristic is only possible using an encryption scheme that is based on a diffusion only approach. Figure 8 illustrates the recovery of a diffused image printed onto a personal identity card obtained using a flat-bed scanner and then captured using a mobile phone camera. In the latter case, the reconstruction is not in focus because of the wide-field nature of the lens used. However, the fact that recovery of the watermark is possible with a mobile phone means that the scrambled data can be transmitted securely and the card holder's image (as in this example) recovered remotely and transmitted back to the same phone for authentication. This provides the necessary physical security needed to implement such a scheme in practice and means that specialist image capture devices are not required on site.
The diffusion process can be carried out using a variety of different noise fields other than the uniform noise field


considered here. Changing the noise field can be of value in two respects: first, it allows a system to be designed that, in addition to specific keys, is based on specific algorithms which must be known a priori. These algorithms can be based on different pseudo uniform random number generators and/or different pseudo chaotic number generators that are post-processed to provide a uniform distribution of numbers. Second, the diffusion field depends on both the characteristics of the watermark image and the noise field. By utilizing different noise fields (e.g. Gaussian noise, Poisson noise, fractal noise and so on), the texture of the output field can be changed. The use of different noise fields is of value when different textures are required that are aesthetically pleasing and can be used to create a background that is printed over the entire document. In this sense, variable noise based diffusion fields can be used to replace complex print security features with the added advantage that, by de-diffusing them, information can be recovered. Further, these fields are very robust to data degradation created by soiling, for example. In the case of binary watermark images, data redundancy allows reconstructions to be generated from a binary output, i.e. after binarizing the diffusion field (with a threshold of 50% for example). This allows the output to be transmitted in a form that can tolerate low resolution and low contrast copying, e.g. a fax.

The tolerance of this method to printing and scanning is excellent provided the output is cropped accurately (to within a few pixels) and oriented correctly. The processes of cropping and orientation can be enhanced and automated by providing a reference frame in which the diffused image is inserted. This is illustrated in Figure 9 which, in addition, shows the effect of diffusing a combination of images. This has the effect of producing a diffused field that is very similar but nevertheless conveys entirely different information.

Fig. 8. Original image (top-left), diffused image (top-right), reconstruction using a flat-bed scanner (bottom-left) and reconstruction using a mobile phone (bottom-right). These images have been scanned in grey scale from the original colour versions printed on to a personalised identity card at 600 dpi stamp-size (i.e. 2cm × 1.5cm).

Fig. 9. Example of the diffusion of composite images with the inclusion of a reference frame for enhancing and automating the processes of cropping and orientation. In each case the data fields have been printed and scanned at 300 dpi.

B. Covert Watermarking

Watermarking is usually considered to be a method in which


the watermark is embedded into a host image in an unobtrusive
way. Another approach is to consider the host image to be
a data field that, when processed with another data field,
generates new information.
Consider two images i_1 and i_2. Suppose we construct the following function:

$$n = F_2^{-1}\left[\frac{I_1 I_2}{|I_1|^2}\right]$$

where I_1 = F_2[i_1] and I_2 = F_2[i_2]. If we now correlate n with i_1, then from the correlation theorem

$$i_1 \odot \odot\, n \leftrightarrow I_1^* \frac{I_1 I_2}{|I_1|^2} = I_2 \leftrightarrow i_2.$$

In other words, we can recover i_2 from i_1 with a knowledge of n. Because this process is based on convolution and correlation alone, it is compatible with, and robust to, printing and scanning, i.e. incoherent optical imaging. An example of this is given in Figure 10. In this scheme, the noise field n is the private key required to reconstruct the watermark and the host image can be considered to be a public key.
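A sketch of this covert scheme is as follows (Python with numpy assumed; the images are synthetic stand-ins and no print/scan distortion is modelled):

import numpy as np

rng = np.random.default_rng(7)
i1 = rng.uniform(size=(64, 64))  # host image (the 'public key')
i2 = np.zeros((64, 64)); i2[24:40, 16:48] = 1.0  # hidden image

I1, I2 = np.fft.fft2(i1), np.fft.fft2(i2)
P = np.abs(I1)**2; P[P == 0] = 1.0
n = np.real(np.fft.ifft2(I1 * I2 / P))  # the noise field (private key)

# Correlating i1 with n returns i2: F^-1[conj(I1) I1 I2 / |I1|^2] = i2
i2_rec = np.real(np.fft.ifft2(np.conj(I1) * np.fft.fft2(n)))
print(np.allclose(i2_rec, i2))  # True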
C. Application to Encryption
One of the principal components associated with the development of methods and algorithms to break cyphertext is
the analysis of the output generated by an attempted decrypt
and its evaluation in terms of an expected type. The output


type is normally assumed to be plain text, i.e. the output is assumed to be in the form of characters, words and phrases associated with a natural language such as English or German, for example. If a plain text document is converted into an image file, then the method described in the previous Section on covert watermarking can be used to diffuse the plain text image i_2 using any other image i_1 to produce the field n. If both i_1 and n are then encrypted, any attack on these data will not be able to make use of an analysis cycle which is based on the assumption that the decrypted output is plaintext. This approach provides the user with a relatively simple method of 'confusing' the cryptanalyst and invalidates attack strategies that have been designed and developed on the assumption that the encrypted data have been derived from plaintext alone.

Fig. 10. Example of a covert watermarking scheme. i1 (top-left) is processed with i2 (top-middle) to produce the noise field (top-right). i2 is printed at 600 dpi, scanned at 300 dpi and then re-sampled back to its original size (bottom-left). Correlating this image with the noise field generates the reconstruction (bottom-centre). The reconstruction depends on just the host image and noise field. If the noise field and/or the host image are different or corrupted, then a reconstruction is not achieved, as in the example given (bottom-right).

VIII. ENTROPY CONSCIOUS CONFUSION AND DIFFUSION

Consider a simple linear array such as a deck of eight cards which contains the ace of diamonds, for example, and where we are allowed to ask a series of sequential questions as to where in the array the card is. The first question we could ask is in which half of the array does the card occur, which reduces the number of cards to four. The second question is in which half of the remaining four cards is the ace of diamonds to be found, leaving just two cards, and the final question is which card is it. Each successive question is the same but applied to successive subdivisions of the deck and in this way we obtain the result in three steps regardless of where the card happens to be in the deck. Each question is a binary choice and, in this example, 3 is the minimum number of binary choices which represents the amount of information required to locate the card in a particular arrangement. This is the same as taking the binary logarithm of the number of possibilities, since log_2 8 = 3. Another way of appreciating this result is to consider a binary representation of the array of cards, i.e. 000, 001, 010, 011, 100, 101, 110, 111, which requires three digits or bits to describe any one card. If the deck contained 16 cards, the information would be 4 bits; if it contained 32 cards, the information would be 5 bits, and so on. Thus, in general, for any number of possibilities N, the information I for specifying a member in such a linear array is given by

$$I = -\log_2 N = \log_2 \frac{1}{N}$$

where the negative sign is introduced to denote that information has to be acquired in order to make the correct choice, i.e. I is negative for all values of N larger than 1. We can now generalize further by considering the case where the number of choices N is subdivided into subsets of uniform size n_i. In this case, the information needed to specify the membership of a subset is given not by N but by N/n_i and hence, the information is given by

$$I_i = -\log_2 P_i$$

where P_i = n_i/N, which is the proportion of the subsets. Finally, if we consider the most general case, where the subsets are non-uniform in size, then the information will no longer be the same for all subsets. In this case, we can consider the mean information given by

$$I = -\sum_{i=1}^{N} P_i \log_2 P_i$$

which is the Shannon Entropy measure established in his classic works on information theory in the 1940s [69]. Information, as defined here, is a dimensionless quantity. However, its partner entity in physics has a dimension called Entropy, which was first introduced by Ludwig Boltzmann as a measure of the dispersal of energy, in a sense, a measure of disorder, just as information is a measure of order. In fact, Boltzmann's Entropy concept has the same mathematical roots as Shannon's information concept in terms of computing the probabilities of sorting objects into bins (a set of N into subsets of size n_i) and in statistical mechanics the Entropy is defined as [70], [71]

$$E = -k \sum_i P_i \ln P_i$$

where k is Boltzmann's constant. Shannon's and Boltzmann's equations are similar. E and I have opposite signs, but otherwise differ only by their scaling factors and they convert to one another by E = (k ln 2)I. Thus, an Entropy unit is equal to k ln 2 of a bit. In Boltzmann's equation, the probabilities P_i refer to internal energy levels. In Shannon's equations, the P_i are not a priori assigned such specific roles and the expression can be applied to any physical system to provide a measure of order. Thus, information becomes a concept equivalent to Entropy and any system can be described in terms of one or the other. An increase in Entropy implies a decrease of information and vice versa. This gives rise to the fundamental conservation law: the sum of (macroscopic) information change and Entropy change in a given system is zero.
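Computing the information measure from a discrete PDF is straightforward. The sketch below (Python with numpy assumed; the bin count, stream length and the base-2 logarithm follow the convention used above, so the absolute values depend on these choices) compares a uniformly and a normally distributed stream:

import numpy as np

def information_entropy(stream, bins=64):
    p, _ = np.histogram(stream, bins=bins)
    p = p / p.sum()
    p = p[p > 0]  # 0 log 0 is taken to be 0
    return -np.sum(p * np.log2(p))

rng = np.random.default_rng(0)
print(information_entropy(rng.uniform(size=3000)))  # close to log2(64)
print(information_entropy(rng.normal(size=3000)))   # lower: biased PDF

The uniform stream always returns the larger value, consistent with the comparison given in Figure 11.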
From the point of view of designing an appropriate substitution cipher, the discussion above clearly dictates that the cipher
n[i] should be such that the Entropy of the ciphertext u[i] is a


maximum. This requires that a PRNG algorithm be designed that outputs a number stream whose Entropy is maximum, i.e. as large as is possible in practice. The stream should have a PDF P_i that yields the largest possible values for I. Figure 11 shows a uniformly distributed and a Gaussian distributed random number stream consisting of 3000 elements and the characteristic discrete PDFs using 64 bins (i.e. for N = 64). The Information Entropy, which is computed directly from the PDFs using the expression for I given above, is always greater for the uniformly distributed field. This is to be expected because, for a uniformly distributed field, there is no bias associated with any particular numerical range and hence, no likelihood can be associated with a particular state. Hence, one of the underlying principles associated with the design of a cipher n[i] is that it should output a uniformly distributed sequence of random numbers. However, this does not mean that the ciphertext itself will be uniformly distributed since if

$$u(\mathbf{r}) = u_0(\mathbf{r}) + n(\mathbf{r})$$

then

$$\Pr[u(\mathbf{r})] = \Pr[u_0(\mathbf{r})] \otimes \Pr[n(\mathbf{r})].$$

Fig. 11. A 3000 element uniformly distributed random number stream (top left) and its 64-bin discrete PDF (top right) with I = 4.1825, and a 3000 element Gaussian distributed random number stream (bottom left) and its 64-bin discrete PDF (bottom right) with I = 3.2678.

This is illustrated in Figure 12 which shows 128-bin histograms for a 7-bit ASCII plaintext (the LaTeX file associated with this paper) u_0[i], a stream of uniformly distributed integers n[i], 0 ≤ n ≤ 127, and the ciphertext u[i] = u_0[i] + n[i]. The spike associated with the plaintext histogram reflects the character that is most likely to occur in the plaintext of a natural Indo-European language, i.e. a space with ASCII value 32. Although the distribution of the ciphertext is broader than that of the plaintext, it is not as broad as the cipher and certainly not uniform. Thus, although the Entropy of the ciphertext is larger than that of the plaintext (in this example I_u0 = 3.4491 and I_u = 5.3200), it is still less than that of the cipher (in this example I_n = 5.5302).

Fig. 12. 128-bin histograms for a 7-bit ASCII plaintext u0[i] (left), a stream of uniformly distributed integers between 0 and 127 n[i] (centre) and the substitution cipher u[i] (right).

There are two ways in which this problem can be solved. The first method is to construct a cipher n with a PDF such that

$$P_n(x) \otimes P_{u_0}(x) = U(x)$$

where U(x) = 1 ∀x. Then

$$P_n(x) = U(x) \otimes Q(x)$$

where

$$Q(x) = F_1^{-1}\left[\frac{1}{F_1[P_{u_0}(x)]}\right].$$

But this requires that the cipher is generated in such a way that its output conforms to an arbitrary PDF as determined by the plaintext to be encrypted. The second method is based on assuming that the PDF of all plaintexts will be of the form given in Figure 12 with a characteristic dominant spike associated with the number of spaces that occur in the plaintext^14. Noting that

$$P_n(x) \otimes \delta(x) = P_n(x),$$

then as the amplitude of the spike increases, the output increasingly approximates a uniform distribution; the Entropy of the ciphertext increases as the Entropy of the plaintext decreases. One simple way to implement this result is to pad out the plaintext with a single character. Padding out a plaintext file with any character provides a ciphertext with a broader distribution, the character '?' (with an ASCII decimal integer of 63) providing a symmetric result. The statistical effect of this is illustrated in Figure 13 where I_u0 = 1.1615, I_n = 5.5308 and I_u = 5.2537.
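The padding effect can be demonstrated with the following sketch (Python with numpy assumed; the plaintext, the pad factor and the 7-bit cipher range are illustrative):

import numpy as np

def entropy(x):
    p = np.bincount(x) / x.size
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

rng = np.random.default_rng(42)
plaintext = np.frombuffer(b"confusion and diffusion " * 100, dtype=np.uint8)
padded = np.concatenate(
    [plaintext, np.full(10 * plaintext.size, ord('?'), dtype=np.uint8)])

for u0 in (plaintext, padded):
    n = rng.integers(0, 128, size=u0.size)  # uniform cipher stream
    u = u0.astype(int) + n  # confusion only substitution model
    print(entropy(u0.astype(int)), entropy(n), entropy(u))

Padding drives the plaintext PDF towards a delta function, and the entropy of the ciphertext correspondingly approaches that of the cipher.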
IX. STATISTICAL PROPERTIES OF A CIPHER
Diffusion has been considered via the properties associated
with the homogeneous (classical) diffusion equation and the
14 This is only possible provided the plaintext is an Indo-European alphanumeric array and is not some other language or file format - a compressed image file, for example.


Fig. 13. 127-bin histograms for a 7-bit ASCII plaintext u0[i] (left) after space-character padding, a stream of uniformly distributed integers between 0 and 255 n[i] (centre) and the substitution cipher u[i] (right).

general Green's function solution. Confusion has been considered through the application of the inhomogeneous diffusion equation with a stochastic source function and it has been shown that

$$u(\mathbf{r}) = p(\mathbf{r}) \otimes_{\mathbf{r}} u_0(\mathbf{r}) + n(\mathbf{r})$$

where p is a Gaussian Point Spread Function and n is a stochastic function.

Diffusion of noise involves the case when u_0 is a stochastic function. Diffusion by noise involves the use of a PSF p that is a stochastic function. If u_0 is taken to be deterministic information, then we can consider the processes of noise diffusion and confusion to be compounded in terms of the following:

Diffusion: $u(\mathbf{r}) = n(\mathbf{r}) \otimes_{\mathbf{r}} u_0(\mathbf{r})$.

Confusion: $u(\mathbf{r}) = u_0(\mathbf{r}) + n(\mathbf{r})$.

Diffusion and Confusion: $u(\mathbf{r}) = n_1(\mathbf{r}) \otimes_{\mathbf{r}} u_0(\mathbf{r}) + n_2(\mathbf{r})$.

The principal effects of diffusion and confusion have been illustrated using various test images. This has been undertaken for visual purposes only but on the understanding that such effects apply to fields in different dimensions in a similar way.

The statistical properties associated with independent random variables have also been considered. One of the most significant results associated with random variable theory is compounded in the Central Limit Theorem. When data is recorded, the stochastic term n is often the result of many independent sources of noise due to a variety of physical, electronic and measuring errors. Each of these sources may have a well-defined PDF but if n is the result of the addition of each of them, then the PDF of n tends to be Gaussian distributed. Thus, Gaussian distributed noise tends to be common in the large majority of applications in which u is a record of a physical quantity.

In cryptology, the diffusion/confusion model is used in a variety of applications that are based on diffusion only, confusion only and combined diffusion/confusion models. One such example of the combined model is illustrated in Figure 5 which shows how one data field can be embedded in another field (i.e. how one image can be used to watermark another image using noise diffusion). In standard cryptography, one of the most conventional methods of encrypting information is through application of a confusion only model. This is equivalent to implementing a model where it is assumed that the PSF is a delta function so that

$$u(\mathbf{r}) = u_0(\mathbf{r}) + n(\mathbf{r}).$$

If we consider the discrete case in one dimension, then

$$u[i] = u_0[i] + n[i]$$

where u_0[i] is the plaintext array or just 'plaintext' (a stream of integer numbers, each element representing a symbol associated with some natural language, for example), n[i] is the 'cipher' and u[i] is the 'ciphertext'. Methods are then considered for the generation of stochastic functions n[i] that are best suited for the generation of the ciphertext. This is the basis for the majority of substitution ciphers where each value of each element of u_0[i] is substituted for another value through the addition of a stochastic function n[i], a function that should: (i) include outputs that are zero in order that the spectrum of random numbers is complete^15; (ii) have a uniform PDF. The conventional approach to doing this is to design appropriate PRNGs or, as discussed later in this work, pseudo chaotic ciphers. In either case, a cipher should be generated with maximum Entropy, which is equivalent to ensuring that the cipher is a uniformly distributed stochastic field. However, it is important to appreciate that the statistics of a plaintext are not the same as those of the cipher when encryption is undertaken using a confusion only model; instead, the statistics are determined by the convolution of the PDF of the plaintext with the PDF of the cipher. Thus, if

$$u(\mathbf{r}) = u_0(\mathbf{r}) + n(\mathbf{r})$$

then

$$\Pr[u(\mathbf{r})] = \Pr[n(\mathbf{r})] \otimes_{\mathbf{r}} \Pr[u_0(\mathbf{r})].$$

One way of maximising the Entropy of u is to construct u_0 such that Pr[u_0(r)] ≈ δ(r). A simple and practical method of doing this is to pad the data u_0 with a single element that increases the data size but does not intrude on the legibility of the plaintext.

Assuming that the encryption of a plaintext u_0 is undertaken using a confusion only model, there exists the possibility of encrypting the ciphertext again. This is an example of double encryption, a process that can be repeated an arbitrary number

15 The Enigma cipher, for example, suffered from a design fault with regard to this issue in that a letter could not reproduce itself, i.e. u[i] ≠ u_0[i] ∀i. This provided a small statistical bias which was nevertheless significant in the decryption of Enigma ciphers.


of times to give triple and quadruple encrypted outputs.


However, multiple encryption procedures in which
u(r) = u0 (r) + n1 (r) + n2 (r) + ...
where n1 , n2 ,... are different ciphers, each consisting of uniformly distributed noise, suffer from the fact that the resultant
cipher is normally distributed because, from the Central Limit
Theorem,

$$\Pr[n_1 + n_2 + \cdots] \to \text{Gauss}(x).$$
For this reason, multiple encryption systems are generally not
preferable to single encryption systems. A notable example
is the triple DES (Data Encryption Standard) or DES3 system
[72] that is based on a form of triple encryption and originally
introduced to increase the key length associated with the
generation of a single cipher n1 . DES3 was endorsed by the
National Institute of Standards and Technology (NIST) as a
temporary standard to be used until the Advanced Encryption
Standard (AES) was completed in 2001 [73].
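The loss of uniformity under multiple encryption is easily checked numerically, as in the sketch below (Python with numpy assumed; 8-bit ciphers are used for illustration):

import numpy as np

def entropy(x):
    p = np.bincount(x) / x.size
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

rng = np.random.default_rng(3)
n1 = rng.integers(0, 256, size=100000)
n2 = rng.integers(0, 256, size=100000)
n3 = rng.integers(0, 256, size=100000)

for n in (n1, n1 + n2, n1 + n2 + n3):
    support = n.max() - n.min() + 1
    print(entropy(n), np.log2(support))

For the single cipher, the entropy is close to the maximum log2 of its support; for the sums, the gap between the two printed values widens as the resultant cipher becomes increasingly Gaussian distributed.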
The statistics of an encrypted field formed by the diffusion of u_0 (assumed to be a binary field) with noise produces an output that is Gaussian distributed, i.e. if

$$u(\mathbf{r}) = n(\mathbf{r}) \otimes_{\mathbf{r}} u_0(\mathbf{r})$$

then

$$\Pr[u(\mathbf{r})] = \Pr[n(\mathbf{r}) \otimes_{\mathbf{r}} u_0(\mathbf{r})] \to \text{Gauss}(x).$$

Thus, the diffusion of u_0 produces an output whose statistics are not uniform but normally distributed. The Entropy of a diffused field using uniformly distributed noise is therefore less than the Entropy of a confused field. It is for this reason that a process of diffusion should ideally be accompanied by a process of confusion when such processes are applied to cryptology in general.

The application of noise diffusion for embedding or watermarking one information field in another is an approach that has a range of applications in covert ciphertext transmission. However, since the diffusion of noise by a deterministic PSF produces an output whose statistics tend to be normally distributed, such fields are not best suited for encryption. However, this process is important in the design of stochastic fields that have important properties for the camouflage of encrypted data.
X. ITERATED FUNCTION SYSTEMS AND CHAOS

In cryptography, the design of specialized random number generators with idealized properties forms the basis of many of the algorithms that are applied. Although the types of random number generator considered so far are of value in the generation of noise fields, the properties of these algorithms are not well suited to cryptography, especially if the cryptosystem is based on a public domain algorithm. This is because it is relatively easy to apply brute force attacks in order to recover the parameters used to drive a known algorithm, especially when there is a known set of rules required to optimise the algorithm in terms of parameter specifications. In general, stream ciphers typically use an iteration of the type

$$x_{i+1} = f(x_i, p_1, p_2, ...)$$

where pi is some parameter set (e.g. prime numbers) and x0 is


the key. The cipher x, which is usually of decimal integer type,
is then written in binary form (typically using ASCII 7-bit
code) and the resulting bit stream used to encrypt the plaintext
(after conversion to a bit stream with the same code) using an
XOR operation. The output bit stream can then be converted
back to ASCII ciphertext form as required. Decryption is
then undertaken by generating the same cipher (for the same
key) and applying an XOR operation to the ciphertext (binary
stream). The encryption/decryption procedure is thus of the
same type and attention is focused on the characteristics of
the algorithm that is used for computing the cipher. However, whatever algorithm is designed and irrespective of its
strength and the length of the key that is used, in all cases,
symmetric systems require the users to exchange the key. This
requires the use of certain key exchange algorithms. Stream ciphers are essentially Vernam-type ciphers which encrypt bit streams on a bit-by-bit basis. By comparison, block ciphers operate on blocks of the stream and may apply permutations and shifts to the data which depend on the key used. In this section, we provide the foundations for the use of IFS for generating Vernam ciphers that are constructed from random length blocks of data that are based on the application of different IFS.
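The following is a minimal sketch of such a Vernam-type stream cipher (Python with numpy assumed). For illustration only, the keystream is taken from the low byte of the Lehmer iteration used earlier; in the approach developed in this work, this role is played by post-conditioned chaotic iterators:

import numpy as np

def keystream(key, length, a=7**7, P=2**31 - 1):
    x, out = key, []
    for _ in range(length):
        x = (a * x) % P
        out.append(x & 0xFF)  # low byte of each iterate
    return np.array(out, dtype=np.uint8)

plaintext = np.frombuffer(b"Attack at dawn", dtype=np.uint8)
cipher = keystream(key=1873, length=plaintext.size)
ciphertext = plaintext ^ cipher  # encryption ...
recovered = ciphertext ^ cipher  # ... and decryption are the same XOR
print(recovered.tobytes())  # b'Attack at dawn'

Encryption and decryption are the same operation, so the security of the system rests entirely on the cipher-generating algorithm and the exchange of the key x_0.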
A. Background to Chaos
The word Chaos appeared in early Greek writings and
denoted either the primeval emptiness of the universe before
things came into being or the abyss of the underworld. Both
concepts occur in the Theogony of Hesiod16 . This concept tied
in with other early notions that saw in Chaos the darkness of
the underworld. In later Greek works, Chaos was taken to
describe the original state of things, irrespective of the way
they were conceived. The modern meaning of the word is
derived from Ovid (Publius Ovidius Naso - known to the
English speaking world as Ovid), a Roman poet (43BC 17AD) and a major inuence in early Latin literature, who
saw Chaos as the original disordered and formless mass, from
which the ordered universe was derived.
The modern notion of chaos - apart from being a term to describe a mess - is connected with the behaviour of dynamical systems that appear to exhibit erratic and non-predictable behaviour but, on closer inspection, reveal properties that have definable structures. Thus, compared with the original Greek concept of chaos, chaotic systems can reveal order, bounded forms and determinism, a principal feature being their self-organisation and characterisation in terms of self-affine structures. This aspect of chaos immediately suggests that chaotic systems are not suitable for applications to cryptography, which requires ciphers that have no predictable dynamic behaviour or structure of any type, e.g. pseudo random number streams that are uniformly distributed with maximum entropy. However, by applying appropriate post-conditioning criteria to a pseudo chaotic number stream, a cipher can be designed that has the desired properties.
16 Hesiod, 700 BC, one of the earliest Greek poets. His epic Theogony
describes the myths of the gods.


The idea that a simple nonlinear but entirely deterministic system can behave in an apparently unpredictable and chaotic manner was first noticed by the great French mathematician Henri Poincaré in the late Nineteenth Century. In spite of this, the importance of chaos was not fully appreciated until the widespread availability of digital computers for numerical simulations and the demonstration of chaos in various physical systems. In the early 1960s, the American mathematician Edward Lorenz re-discovered Poincaré's observations while investigating the numerical solution of a system of non-linear equations used to model atmospheric turbulence, equations that are now known as the Lorenz equations.
A primary feature of chaotic systems is that they exhibit
self-affine structures when visualised and analysed in an appropriate way, i.e. an appropriate phase space. In this sense, the
geometry of a chaotic system may be considered to be fractal.
This is the principal feature that provides a link between
chaotic dynamics and fractal geometry.
A key feature of chaotic behaviour in different systems is the sensitivity to initial conditions. Thus, 'It may happen that small differences in the initial conditions produce very great ones in the final phenomena. A small error in the former will produce an enormous error in the future. Prediction becomes impossible' (Edward Lorenz^17). This aspect of a chaotic system is ideal for encryption in terms of the diffusion requirement discussed earlier, i.e. that a cryptographic system should be sensitive to the initial conditions (i.e. the key) that are applied. However, in a more general context, the sensitivity to initial conditions of chaotic systems theory is an important aspect of using the theory to develop a mathematical description of complex phenomena such as Brownian and fractional Brownian processes, weather changes in meteorology or population fluctuations in biology. The relative success of chaos theory for modelling complex phenomena has caused an important paradigm shift that has provided the first scientific explanation for the coexistence of such concepts as law and disorder, determinism and unpredictability.
Formally, chaos theory can be defined as the study of complex nonlinear dynamic systems. The word 'complex' is related to the recursive and nonlinear characteristics of the algorithms involved, and the word 'dynamic' implies the non-constant and non-periodic nature of such systems. Chaotic systems are commonly based on recursive processes, either in the form of single or coupled algebraic equations or a set of (single or coupled) differential equations modelling a physical or virtual system.
Chaos is often, but incorrectly, associated with noise in that it is taken to represent a field which is unpredictable. Although this is the case, a field generated by a chaotic system generally has more structure if analysed in an appropriate way, a structure that may exhibit features that are similar at different scales. Thus, chaotic fields are not the same as noise fields, either in terms of their behaviour or the way in which they are generated. Simple chaotic fields are typically the product of an iteration of the form x_{i+1} = f(x_i), where the function f is some nonlinear map which depends on a single
17 Cambel, A B, Applied Chaos Theory, Gorman, 2000.

or a set of parameters. The chaotic behaviour of x_i depends critically on the value of the parameter(s). The iteration process may not necessarily be a single nonlinear mapping but consist of a set of nonlinear coupled equations of the form

$$x_{i+1}^{(1)} = f_1(x_i^{(1)}, x_i^{(2)}, ..., x_i^{(N)}),$$
$$x_{i+1}^{(2)} = f_2(x_i^{(1)}, x_i^{(2)}, ..., x_i^{(N)}),$$
$$\vdots$$
$$x_{i+1}^{(N)} = f_N(x_i^{(1)}, x_i^{(2)}, ..., x_i^{(N)})$$

where the functions f_1, f_2, ..., f_N may all be nonlinear, or a mixture of nonlinear and linear functions. In turn, such a coupled system can be the result of many different physical models covering a wide range of applications in science and engineering.
B. Verhulst Processes and the Logistic Map
Suppose there is a fixed population of N individuals living on an island (with no one leaving or entering) and a fatal disease (for which there is no cure) is introduced, which is spread through personal contact causing an epidemic to break out. The rate of growth of the disease will normally be proportional to the number of carriers c, say. Suppose we let x = c/N be the proportion of individuals with the disease so that 100x is the percentage of the population with the disease. Then, the equation describing the rate of growth of the disease is

$$\frac{dx}{dt} = kx$$

whose solution is

$$x(t) = x_0 \exp(kt)$$

where x_0 is the proportion of the population carrying the disease at t = 0 (i.e. when the disease first strikes) and k is a constant of proportionality defining the growth rate. The problem with this conventional growth rate model is that when x = 1, there can be no further growth of the disease because the island population no longer exists, and so we must impose the condition that 0 < x(t) ≤ 1 ∀t. Alternatively, suppose we include the fact that the rate of growth must also be proportional to the number of individuals 1 - x who do not become carriers, due to isolation of their activities and/or genetic disposition, for example. Then, our rate equation becomes

$$\frac{dx}{dt} = kx(1 - x)$$

and if x = 1, the epidemic is extinguished. This equation can be used to model a range of situations similar to that introduced above associated with predator-prey type processes. (In the example given above, the prey is the human and the predator could be a virus or bacterium, for example.) Finite differencing over a time interval Δt, we have

$$\frac{x_{i+1} - x_i}{\Delta t} = k x_i (1 - x_i)$$

or

$$x_{i+1} = x_i + k \Delta t\, x_i (1 - x_i)$$


or

$$x_{i+1} = r x_i (1 - x_i)$$

where r = 1 + kΔt. This is a simple quadratic iterator known as the logistic map and has a range of characteristics depending on the value of r. This is illustrated in Figure 14, which shows the output (for just 30 elements) from this iterator for r = 1, r = 2, r = 3 and r = 4 and for an initial value of 0.1^18. For r = 1 and r = 2, convergent behaviour

Fig. 14. Output (30 elements) of the logistic map for values of r = 1 (top left), r = 2 (top right), r = 3 (bottom left) and r = 4 (bottom right) and an initial value of 0.1.

Fig. 15. Feigenbaum diagram of the logistic map for 0 < r < 4 and 0 < x < 1.

takes place; for r = 3 the output is oscillatory and for r = 4 the behaviour is chaotic. The transition from monotonic convergence to oscillatory behaviour is known as a bifurcation and is better illustrated using a so-called Feigenbaum map or diagram, which is a plot of the output of the iterator in terms of the values produced (after iterating enough times to produce a consistent output) for different values of r. An example of this for the logistic map is given in Figure 15 for 0 < r ≤ 4 and shows convergent behaviour for values of r from 0 to approximately 3, bifurcations for values of r between approximately 3 and just beyond 3.5 and then a region of chaotic behaviour, achieving full chaos at r = 4 where, in each case, the output consists of values between 0 and 1. However, closer inspection of this data representation reveals repeating patterns, an example being given in Figure 16, which is a Feigenbaum diagram of the output for values of r between 3.840 and 3.855 and values of x between 0.44 and 0.52. As before, we observe a region of convergence, bifurcation and then chaos. Moreover, from Figure 16 we observe another region of this map (for values of r around 3.854) in which this same behaviour occurs. The interesting feature about this map is that the convergence-bifurcation-chaos characteristics are repeated, albeit at smaller scales. In other words, there is a similarity of behaviour at smaller scales, i.e. the pattern
18 The initial value, which is taken to be any value between 0 and 1, changes
the signature of the output but not its characteristics, at least, in the ideal
case.

Fig. 16. Feigenbaum diagram of the logistic map for 3.840 < r < 3.855
and 0.44 < x < 0.52.

of behaviour. Further, this complex behaviour comes from a remarkably simple iterator, i.e. the map x → rx(1 - x).
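The behaviour shown in Figure 14 can be reproduced with a few lines of code (Python; the initial value 0.1 and the 30 iterations follow the figure):

def logistic(r, x0=0.1, n=30):
    x, out = x0, []
    for _ in range(n):
        x = r * x * (1.0 - x)
        out.append(x)
    return out

for r in (1.0, 2.0, 3.0, 4.0):
    print(r, [round(v, 4) for v in logistic(r)[-4:]])  # last few iterates

For r = 1 and r = 2 the final iterates settle to a fixed value, for r = 3 they oscillate, and for r = 4 they wander erratically over the interval (0, 1).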
C. Examples of Chaotic Systems

In addition to the logistic map, which has been used in the previous section to introduce a simple IFS that gives a chaotic output, there is a wide variety of other maps which yield signals that exhibit the same basic properties as the logistic map (convergence-bifurcation-chaos) with similar structures at different scales at specific regions of the Feigenbaum diagram. Examples include the maps given below:

1) Linear functions: The sawtooth map

$$x_{i+1} = 5x_i \bmod 4.$$

The tent map

$$x_{i+1} = r(1 - |2x_i - 1|).$$


The generalized tent map

$$x_{i+1} = r(1 - |2x_i - 1|^m), \quad m = 1, 2, 3, ...$$

2) Nonlinear functions: The sin map

$$x_{i+1} = |\sin(r x_i)|.$$

The tangent feedback map

$$x_{i+1} = r x_i [1 - \tan(x_i / 2)].$$

The logarithmic feedback map

$$x_{i+1} = r x_i [1 - \log(1 + x_i)].$$

Further, there are a number of variations on a theme that are of value, an example being the delayed logistic map

$$x_{i+1} = r x_i (1 - x_{i-1})$$

which arises in certain problems in population dynamics.
Moreover, coupled iterative maps occur from the development of physical models leading to nonlinear coupled differential equations, a famous and historically important example being the Lorenz equations given by

$$\frac{dx_1}{dt} = a(x_2 - x_1), \quad \frac{dx_2}{dt} = (b - x_3)x_1 - x_2, \quad \frac{dx_3}{dt} = x_1 x_2 - c x_3$$

where a, b and c are constants. These equations were originally derived by Lorenz from the fluid equations of motion (the Navier-Stokes equation, the equation for thermal conductivity and the continuity equation) used to model heat convection in the atmosphere and were studied in an attempt to explore the transition to turbulence where a fluid layer in a gravitational field is heated from below. By finite differencing these equations, we convert the functions x_i, i = 1, 2, 3 into discrete form x_i^{(n)}, i = 1, 2, 3 giving (using forward differencing)

$$x_1^{(n+1)} = x_1^{(n)} + \Delta t\, a(x_2^{(n)} - x_1^{(n)}),$$
$$x_2^{(n+1)} = x_2^{(n)} + \Delta t\, [(b - x_3^{(n)}) x_1^{(n)} - x_2^{(n)}],$$
$$x_3^{(n+1)} = x_3^{(n)} + \Delta t\, [x_1^{(n)} x_2^{(n)} - c x_3^{(n)}].$$

For specific values of a, b and c (e.g. a = 10, b = 28 and c = 8/3) and a step length Δt, the digital signals x_1^{(n)}, x_2^{(n)} and x_3^{(n)} exhibit chaotic behaviour which can be analysed quantitatively in the three dimensional phase space (x_1, x_2, x_3) or variations on this theme, e.g. a three dimensional plot with axes (x_1 + x_2, x_3, x_1 - x_3) or as a two dimensional projection with axes (x_1 + x_2, x_3), an example of which is shown in Figure 17.

Fig. 17. Two dimensional phase space analysis of the Lorenz equations illustrating the strange attractor.

Here, we see that the path is confined to two domains which are connected. The path is attracted to one domain and then to another, but this connection (the point at which the path changes from one domain to the next) occurs in an erratic way - an example of a 'strange attractor'.
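A sketch of this forward-difference iteration is given below (Python; the parameter values a = 10, b = 28, c = 8/3 follow the text, while the step length and initial condition are arbitrary choices):

def lorenz(n=10000, dt=0.01, a=10.0, b=28.0, c=8.0/3.0):
    x1, x2, x3 = 1.0, 1.0, 1.0
    path = []
    for _ in range(n):
        x1, x2, x3 = (x1 + dt * a * (x2 - x1),
                      x2 + dt * ((b - x3) * x1 - x2),
                      x3 + dt * (x1 * x2 - c * x3))
        path.append((x1 + x2, x3))  # 2D projection used in Figure 17
    return path

print(lorenz()[-3:])

Plotting the returned pairs reproduces the two-lobed trajectory of Figure 17, the path switching between the lobes erratically.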
As with the simple iterative maps discussed previously, there are a number of nonlinear differential equations (coupled or otherwise) that exhibit chaos whose behaviour can be quantified using an appropriate phase space. These include: The Rössler equations

dx1/dt = −x2 − x3, dx2/dt = x1 + ax2, dx3/dt = b + x3(x1 − c).

The Hénon-Heiles equations

dx/dt = p_x, dp_x/dt = −x − 2xy, dy/dt = p_y, dp_y/dt = −y − x² + y².

The Hill equation

d²x(t)/dt² + ω²(t)x(t) = 0,

a special case being the Mathieu equation when

ω²(t) = ω0²(1 + ε cos t),

ω0 and ε being constants. The Duffing oscillator

dx/dt = v, dv/dt = −av + x − bx³ + cos t

where a and b are constants. The nonlinear Schrödinger equation

d²x(t)/dt² + ω²x(t) = |x(t)|² x(t).
In each case, the chaotic nature of the output of these systems depends on the values of the constants.
For iterative processes where stable convergent behaviour is expected, an output that is characterised by exponential growth can be taken to be due to unacceptable numerical instability. However, with IFS that exhibit intrinsic instability, in that the output does not converge to a specific value, the Lyapunov exponent is used to quantify the characteristics of the output. This exponent, or Dimension, provides a principal measure of chaoticity and is derived in Appendix II.
XI. ENCRYPTION USING DETERMINISTIC CHAOS
The use of chaos in cryptology was first considered in the early 1950s by the American electrical engineer Claude Shannon and the Russian mathematician Vladimir Alexandrovich Kotelnikov, who laid the theoretical foundations for modern information theory and cryptography. It was Shannon who first explicitly mentioned the basic stretch-and-fold mechanism
associated with chaos for the purpose of encryption: 'Good mixing transformations are often formed by repeated products of two simple non-commuting operations' [11]. Hopf19 considered the mixing of dough by such a sequence of non-commuting operations. The dough is first rolled out into a thin slab, then folded over, then rolled, and then folded again and so on. The same principle is used in the making of a Japanese sword, the aim being to produce a material that is a highly diffused version of the original material structure.
The use of chaos in cryptography was not fully appreciated until the late 1980s when the simulation of chaotic dynamical systems became commonplace and when the role of cryptography in IT became increasingly important. Since the start of the 1990s, an increasing number of publications have considered the use of chaos in cryptography, e.g. [74]-[78]. These have included schemes based on synchronized chaotic (analogue) circuits, for example, which belong to the field of steganography and secure radio communication [79]. Over the 1990s, cryptography started to attract a variety of scientists and engineers from diverse fields who started exploiting dynamical systems theory for the purpose of encryption. This included the use of discrete chaotic systems such as cellular automata, Kolmogorov flows and discrete affine transformations in general to provide more efficient encryption schemes [80]-[83]. Since 2000, the potential of chaos-based communications, especially with regard to spread spectrum modulation, has been recognized. Many authors have described chaotic modulations and suggested a variety of electronics based implementations, e.g. [76]-[79]. However, the emphasis has been on information coding and information hiding and embedding. Much of this published work has been of theoretical and some technological interest, with work being undertaken in both an academic and industrial research context (e.g. [84]-[90]). However, it is only relatively recently that chaos-based ciphers have been implemented in software and introduced to the market. One example of this is the basis of the author's own company - Crypstic™ Limited - in which the principle of multi-algorithmicity using chaos-based ciphers [11], [91] has been used to produce meta-encryption engines that are mounted on a single, a pair or a group of flash (USB - Universal Serial Bus) memory sticks. Some of these memory sticks have been designed to include a hidden memory accessible through a covert procedure (such as the renaming - by deletion - of an existing file or folder) from which the encryption engine(s) can be executed.
Consider an algorithm that outputs a number stream which can be ordered, chaotic or random. In the case of an ordered number stream (one generated from a discretized piecewise continuous function, for example), the complexity of the field is clearly low. Moreover, the information and, specifically, the information entropy (the lack of information we have about the exact state of the number stream) is low, as is the information content that can be conveyed by such a number stream.
A random number stream (taken to have a uniform PDF,

19 Eberhard F. F. Hopf (1902-1983), an Austrian-born mathematician who made significant contributions in topology and ergodic theory and studied mixing in compact spaces, e.g. On Causality, Statistics and Probability, Journal of Mathematics and Physics, 13, 51-102, 1934.

for example) will provide a sequence from which, under ideal circumstances, it is not possible to predict any number in the sequence from the previous values. All we can say is that the probability of any number occurring between a specified range is equally likely. In this case, the information entropy is high. However, the complexity of the field, in terms of erratic transitions from one type of localized behaviour to another, is low. Thus, in comparison to a random field, a chaotic field is high in complexity but its information entropy, while naturally higher than that of an ordered field, is lower than that of a random field, e.g. chaotic fields which exhibit uniform number distributions are rare. Such fields therefore need to be post-processed in order that the output conforms to a uniform distribution.
We consider a dynamic continuous-state continuous-time system S = ⟨X, K, f⟩ as follows:

dx/dt = f(x, k), x ∈ X, k ∈ K

where f is a smooth function, X is a state space and K is a parameter space. The equation is taken to satisfy the conditions for the existence and uniqueness of solutions x(x0, t) with the initial condition x0 = x(x0, 0), the solution curve t → x(x0, t) being the trajectory.
For cryptography, we focus on dynamic discrete-time systems which can be written in the following form:

x_{i+1} = f(x_i, k), x_i ∈ X, k ∈ K, i = 0, 1, 2, ...

where x_i is a discrete state of the system. The trajectory (x_i, x_0) is defined by the sequence x_0, x_1, x_2, .... This equation is similar to the cryptographic iterated functions used for pseudo random number generation, block ciphers and other constructions such as the DES, RSA and AES ciphers. Consequently, in both nonlinear dynamics and cryptography we deal with an iterated key-dependent transformation of information. There are several sufficient conditions satisfied by a dynamic system that guarantee chaos, the sensitivity to initial conditions and topological transitivity being the most common.
A chaotic continuous-state discrete-time system is a dynamic system S = ⟨X, f⟩ with two properties [?]: (i) given a metric space X and a mapping f : X → X, we say that f is topologically transitive on X if, for any two open sets U, V ⊂ X, there is n ≥ 0 such that f^n(U) ∩ V ≠ ∅; (ii) the map f is said to be sensitive to initial conditions if there is δ > 0, n ≥ 0 such that, for any x ∈ X and for any neighborhood H_x of x, there is y ∈ H_x such that |f^n(x) − f^n(y)| > δ. These properties can be interpreted as follows: a dynamic system is chaotic if all trajectories are bounded (by the attractor) and nearby trajectories diverge exponentially at every point of the phase space. The trajectories are continuous and belong to a two-dimensional system that is said to be chaotic. This leads to a natural synergy between chaotic and cryptographic systems that can be described in terms of the following: (i) topological transitivity, which ensures that the system output covers all of the state space, e.g. any plaintext can be encrypted into any ciphertext; (ii) sensitivity to initial conditions, which

corresponds to Shannon's original requirements for an encryption system in the late 1940s. In both chaos and cryptography we are dealing with systems in which a small variation of any variable changes the outputs considerably.
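The sensitivity property is easily demonstrated numerically; in the following Python sketch (an illustration, not taken from the original text) two logistic map trajectories at r = 4 whose seeds differ by 10^-12 become uncorrelated within a few tens of iterations:

    x, y = 0.3, 0.3 + 1e-12
    for n in range(100):
        x, y = 4 * x * (1 - x), 4 * y * (1 - y)
        if abs(x - y) > 0.1:
            print('trajectories separated after', n + 1, 'iterations')
            break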
A. Stream Cipher Encryption
The use of discrete chaotic fields for encrypting data can follow the same basic approach as used with regard to the application of pseudo random number generating algorithms for stream ciphers. Pseudo chaotic numbers are, in principle, ideal for cryptography because they produce number streams that are ultra-sensitive to the initial value (the key). However, instead of using iterative maps based on modular arithmetic with integer operations, here we require the application of nonlinear maps using floating point arithmetic. Thus, the first drawback concerning the application of deterministic chaos for encryption concerns the processing speed, i.e. pseudo random number generators typically output integer streams using integer arithmetic whereas pseudo chaotic number generators produce floating point streams using floating point arithmetic. Another drawback of chaos based cryptography is that the cycle length (i.e. the period over which the number stream repeats itself) is relatively short and not easily quantifiable when compared to the cycle length available using conventional PRNGs, e.g. additive generators, which commence by initialising an array x_i with random numbers (not all of which are even) so that we can consider the initial state of the generator to be x_1, x_2, x_3, ..., to which we then apply

x_i = (x_{i−a} + x_{i−b} + ... + x_{i−m}) mod 2^n

where a, b, ..., m and n are assigned integers20, and which have very long cycle lengths of the order of 2^f(2^55 − 1) where 0 ≤ f ≤ n, and linear feedback shift registers of the form

x_n = (c1 x_{n−1} + c2 x_{n−2} + ... + c_m x_{n−m}) mod 2^k

which, for specific values of c1, c2, ..., c_m, have cycle lengths of 2^k.

20 A well known example is the Fish generator x_i = (x_{i−55} + x_{i−24}) mod 2^32.
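For comparison, the following minimal Python sketch implements an additive generator of the kind described above, using the lags of the Fish generator quoted in footnote 20 (55 and 24) and modulus 2^32; the seeding shown is illustrative:

    import random

    LAG_A, LAG_B, MOD = 55, 24, 2 ** 32
    state = [random.getrandbits(32) | 1 for _ in range(LAG_A)]   # ensures not all values are even

    def next_value():
        v = (state[-LAG_A] + state[-LAG_B]) % MOD
        state.append(v)
        del state[0]          # keep a sliding window of the last 55 values
        return v

    stream = [next_value() for _ in range(5)]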
The application of deterministic chaos to encryption therefore has two distinct disadvantages relative to the application of PRNGs. Another feature of IFS is that the regions over which chaotic behaviour can be generated may be limited. However, this limitation can be overcome by designing IFS with the specific aim of increasing the range of chaos. One method is to use well known maps and modify them to extend the region of chaos. For example, the Matthews cipher is a modification of the logistic map to [92]

x_{i+1} = (1 + r)^(1 + 1/r) x_i (1 − x_i)^r, r ∈ (0, 4].

The effect of this generalization is seen in Figure 18, which shows the Feigenbaum diagram for values of r between 1 and 4. Compared to the conventional logistic map x_{i+1} = rx_i(1 − x_i), r ∈ (0, 4], which yields full chaos at r = 4, the chaotic behaviour of the Matthews map is clearly more extensive, providing full chaos for the majority (but not all) of values of r between approximately 0.5 and 4. In the conventional case, the key is the value of x_0 (the initial condition). In addition, because there is a wide range of chaotic behaviour for the Matthews map, the value of r itself can be used as a primary (or secondary) key.

Fig. 18. Feigenbaum map of the Matthews cipher.
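A minimal Python sketch of the Matthews map as given above, in which both r and the seed x0 act as keys (the values used here are illustrative):

    def matthews(x, r):
        return (1 + r) ** (1 + 1 / r) * x * (1 - x) ** r

    x, r = 0.78654876, 3.2        # keys: seed in (0, 1), r in (0, 4]
    for _ in range(10):
        x = matthews(x, r)
        print(x)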
The approach to using deterministic chaos for encryption has, to date, been based on using conventional and other well known chaotic models of the type discussed above, with modifications such as the Matthews map as required. However, in cryptography, the physical model from which a chaotic map has been derived is not important; only the fact that the map provides a cipher that is good at scrambling the plaintext in terms of diffusion and confusion. This point leads to an approach which exploits two basic features of chaotic maps: (i) they increase the complexity of the cipher; (ii) there is an unlimited number of maps of the form x_{i+1} = f(x_i), for example, that can be literally invented and then tested for chaoticity to produce a data base of algorithms. However, it is important to stress that such ciphers, once invented, need to be post-processed to ensure that the cipher stream is uniformly distributed which, in turn, requires further computational overheads and, as discussed in the following section, may include significant cipher redundancy.
The low cycle lengths associated with chaotic iterators can be overcome by designing block ciphers where the iterator produces a cipher stream only over a block of data whose length is significantly less than the cycle length of the iterator, each block being encrypted using a different key and/or algorithm. The use of different algorithms for encrypting different blocks of data provides an approach that is multi-algorithmic.
B. Block Cipher Encryption and Multi-algorithmicity
Instead of using a single algorithm (such as a Matthews cipher) to encrypt data over a series of blocks using different (block) keys, we can use different algorithms, i.e. chaotic maps. Two maps can be used to generate the length of each

block and the maps that are used to encrypt the plaintext over each block. Thus, suppose we have designed a data base consisting of 100 chaotic maps, say, consisting of IFS f1, f2, f3, ..., f100, each of which generates a floating point number stream through the operation

x_{i+1} = f_m(x_i, p1, p2, ...)

where the parameters p1, p2, ... are pre-set or 'hard-wired' to produce chaos for any initial value x0 ∈ (0, 1), say. An algorithm selection key is then introduced in which two algorithms (or possibly the same algorithm) are chosen to drive the block cipher - f50 and f29, say, the key in this case being (50, 29). Here, we shall consider the case where map f50 determines the algorithm selection and map f29 determines the block size. Map f50 is then initiated with the key 0.26735625, say, and map f29 with the key 0.65376301, say. The outputs from these maps (floating point number streams) are then normalized, multiplied by 100 and 1000, respectively, for example, and then rounded to produce integer streams with values ranging from 1 to 100 and 1 to 1000, respectively. Let us suppose that the first few values of these integer streams are 28, 58, 3, 61 and 202, 38, 785, 426, respectively. The block encryption starts by using map 28 to encrypt 202 elements of the plaintext using the key 0.78654876, say. The second block of 38 elements is then encrypted using map 58 (the initial value being the last floating point value produced by algorithm 28) and the third block of 785 elements is encrypted using algorithm 3 (the initial value being the last floating point value produced by algorithm 58) and so on. The process continues until the plaintext has been fully encrypted with the session key (50, 29, 0.26735625, 0.65376301, 0.78654876).
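The following Python sketch illustrates the block cipher driver described above under stated assumptions: a toy 'data base' of 100 logistic-type maps stands in for the invented IFS, the map outputs are assumed to lie in (0, 1), and the cipher stream is combined with the plaintext bytes by XOR (the binary combination discussed in the next paragraph):

    maps = {m: (lambda x, m=m: (3.6 + m / 250) * x * (1 - x)) for m in range(1, 101)}

    def encrypt(plaintext, key):
        sel_id, len_id, x_sel, x_len, x_run = key
        out, pos = [], 0
        while pos < len(plaintext):
            x_sel = maps[sel_id](x_sel)          # selects the next algorithm
            x_len = maps[len_id](x_len)          # selects the next block length
            algo = 1 + int(x_sel * 99)           # integer in the range 1 to 100
            block = 1 + int(x_len * 999)         # integer in the range 1 to 1000
            for ch in plaintext[pos:pos + block]:
                x_run = maps[algo](x_run)        # cipher stream for this block
                out.append(ch ^ int(x_run * 255))
            pos += block
        return bytes(out)

    ciphertext = encrypt(b'some plaintext', (50, 29, 0.26735625, 0.65376301, 0.78654876))

Since the cipher stream is combined by XOR, decryption is the same operation applied with the same session key.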
Encryption is typically undertaken using a binary representation of the plaintext and applying an XOR operation using a binary representation of the cipher stream. This can be constructed in a variety of ways. For example, one could extract the least significant bits from the floating point format of x_i. Another approach is to divide the floating point range of the cipher into two compact regions and apply a suitable threshold. For example, suppose that the output x_i from a map operating over a given block consists of floating point values between 0 and 1; then, with the application of a threshold of 0.5, we can consider generating the bit stream

b(x_i) = 1, x_i ∈ (0.5, 1]; 0, x_i ∈ [0, 0.5).

However, in applying such a scheme, we are assuming that the distribution of x_i is uniform and this is rarely the case with chaotic maps. Figure 19 shows the PDF for the logistic map x_{i+1} = 4x_i(1 − x_i), which reveals a non-uniform distribution with a bias for floating point numbers approaching 0 and 1. However, the mid range (i.e. for x_i ∈ [0.3, 0.7]) is relatively flat, indicating that the probability for the occurrence of different numbers generated by the logistic map in the mid range is the same. In order to apply the threshold partitioning method discussed above in a way that provides an output that is uniformly distributed for any chaotic map, it is necessary to introduce appropriate conditions and modify the above to

Fig. 19. Probability density function (with 100 bins) of the output from the logistic map for 10000 iterations.

Fig. 20. Illustration of the effect of using multiple algorithms for generating a stream cipher on the computational energy required to attempt a brute force attack.

the form

b(x_i) = 1, x_i ∈ [T, T + Δ+); 0, x_i ∈ [T − Δ−, T); −1, otherwise

where T is the threshold, the output −1 denotes a value that is discarded, and Δ+ and Δ− are those values which yield an output stream that characterizes (to a good approximation) a uniform distribution. For example, in the case of the logistic map, T = 0.5 and Δ+ = Δ− = 0.2. This aspect of the application of deterministic chaos to cryptography, together with the search for a parameter or set of parameters that provides full chaos for an invented map, determines the overall suitability of the function that has been invented for this application.
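A Python sketch of this maximum entropy thresholding, using the logistic map values quoted above (T = 0.5, Δ+ = Δ− = 0.2); values falling outside the two partitions are simply discarded:

    import itertools

    T, DP, DM = 0.5, 0.2, 0.2

    def bit_stream(xs):
        for x in xs:
            if T <= x < T + DP:
                yield 1
            elif T - DM <= x < T:
                yield 0
            # otherwise the value is discarded

    def logistic(x=0.61):
        while True:
            x = 4 * x * (1 - x)
            yield x

    key_bits = list(itertools.islice(bit_stream(logistic()), 64))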
The filtering of a chaotic field to generate a uniformly distributed output is equivalent to maximizing the entropy of the cipher stream (i.e. generating a cipher stream with a uniform PDF), which is an essential condition in cryptography. In terms of cryptanalysis and attack, the multi-algorithmic approach to designing a block cipher introduces a new dimension to the problem. The conventional problem associated with an attack on a symmetric cipher is to search for the private key(s) given knowledge of the algorithm. Here, the problem is to search not only for the session key(s), but also for the algorithms they drive, as illustrated in Figure 20.
One over-riding issue concerning cryptology in general is that algorithm secrecy is weak. In other words, a cryptographic system should not rely on the secrecy of its algorithms and


all such algorithms should be openly published21. The system described here is multi-algorithmic, relying on many different chaotic maps to encrypt the data. Here, publication of the algorithms can be undertaken in the knowledge that many more maps can be invented as required (subject to appropriate conditions in terms of generating a fully chaotic field with a uniform PDF) by a programmer, or possibly with appropriate training of a digital computer.

21 Except for some algorithms developed by certain government agencies - perhaps they have something to hide!

XII. CRYPSTIC™

Crypstic™ is the trade mark for a USB based product that currently uses three approaches for providing secure mobile information exchange: (i) obfuscation; (ii) disinformation; (iii) multi-algorithmic encryption using chaos22. The product has been designed for floating point computations with 32-bit precision operating on PC platforms with an XP or Vista environment.

22 Incorporated under the Companies Act 1985 for England and Wales as a Private Company that is Limited on 19th January, 2005; Company Number 5337521.
A. Obfuscation and Disinformation
Obfuscation is undertaken by embedding the application (the .exe file that performs the encryption/decryption) in an environment (i.e. the USB memory) that contains a wealth of data (files and folders etc.) that is ideally designed to reflect the user's portfolio. This can include areas that are password protected and other public domain encryption systems with encrypted files as required that may be broken and even generate apparently valuable information (given a successful attack) but are in fact provided purely as a form of disinformation. This environment is designed in order to provide a potential attacker, who has gained access to a user's Crypstic™ through theft, for example, with a target rich environment. The rationale associated with the use of a Crypstic™ as a mobile encryption/decryption device follows that associated with a user's management of a key ring. In other words, it is assumed that the user will maintain and implement the Crypstic™ in the same way as a conventional set of keys is used. However, in the case of loss or theft, a new Crypstic™ must be issued which includes a new encryption engine, and under no circumstances is the original Crypstic™ re-issued. Management of the encryption engines and their distribution is of course undertaken by Crypstic™ Limited, which maintains a data base of current users and the encryption engines provided to them in compliance with the RIP Act 2000, Section 49, which deals with the power of disclosure, i.e. for Crypstic™ Limited to provide the appropriate encryption engine for the decryption of any encrypted data that is under investigation by an appropriate authority.
B. Encryption Engine Design
The encryption engine itself can be based on any number of algorithms, each algorithm having been designed with respect to the required (maximum entropy) performance conditions through implementation of the appropriate threshold parameters T and Δ±. The design is based on applying the following basic steps:

Step 1: Invent a (non-linear) function f and apply the IFS

x_{i+1} = f(x_i, p1, p2, ...)

Step 2: Normalise the output of the IFS so that ‖x‖∞ = 1.

Step 3: Graph the output x_i and adjust the parameters p1, p2, ... until the output looks chaotic.

Step 4: Graph the histogram of the output and observe whether there is a significant region of the histogram over which it is flat.

Step 5: Set the values of the thresholds T and Δ± based on the observations made in Step 4.
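A minimal Python sketch of these steps, using the generalized tent map that appears in the table below with its quoted parameters (r = 0.9999, exponent 1.456); the seed and the iteration count are illustrative:

    import numpy as np

    r, m = 0.9999, 1.456

    def f(x):                                    # Step 1: the invented IFS
        return r * (1 - abs(2 * x - 1) ** m)

    x, xs = 0.37, []
    for _ in range(10000):
        x = f(x)
        xs.append(x)
    xs = np.array(xs) / np.max(xs)               # Step 2: normalise the output
    # Step 3: plot xs and adjust the parameters until the output looks chaotic
    hist, edges = np.histogram(xs, bins=100)     # Step 4: inspect the histogram
    # Step 5: choose T, delta+ and delta- over the flat region of the histogram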
Analysing the IFS using a Feigenbaum diagram can also be undertaken but this can be computationally intensive. Further, each IFS can be categorised in terms of parameters such as the Lyapunov Dimension (Appendix II) and information entropy, for example. However, in practice, such parameters yield little in terms of the design of an IFS and are primarily academic. Indeed, the invention and design of such algorithms has a certain art to it which improves with experience. It should be noted that many such inventions fail to be of value in that the statistics may not be suitable (e.g. the PDF may not be flat enough or is flat only over a very limited portion of the PDF), chaoticity may not be guaranteed for all values of the seed x0 between 0 and 1, and the numerical performance of the algorithm may be poor. The aim is to obtain a simple IFS that is numerically relatively trivial to compute, has a broad statistical distribution and is valid for all floating point values of x0 between 0 and 1. Examples of the IFS used for Crypstic™ are given in the following table, where the values of T, Δ+ and Δ− apply to the normalised output stream generated by each function.
Function f(x)              r        T     Δ+     Δ−
rx(1 − tan(x/2))           3.3725   0.5   0.3    0.3
rx[1 − x(1 + x²)]          3.17     0.5   0.25   0.35
rx[1 − x log(1 + x)]       2.816    0.6   0.3    0.2
r(1 − |2x − 1|^1.456)      0.9999   0.5   0.3    0.3
|sin(rx^1.09778)|          0.9990   0.6   0.25   0.25

The functions given in the table above produce outputs that have a relatively broad and smooth histogram which can be made flat by application of the values of T and Δ± given. Some functions, however, produce poor characteristics in this respect. For example, the function

f(x) = r|1 − tan(sin x)|, r = 1.5

has a highly irregular histogram (see Figure 21) which is not suitable in terms of applying values of T and Δ± and, as such, is not an appropriate IFS for a Crypstic™ application.
C. Graphical User Interface
In conventional encryption systems, it is typical to provide a Graphical User Interface (GUI) with fields for inputting

Fig. 21. The first 1000 elements of x_{i+1} = r|1 − tan(sin x_i)|, r = 1.5, 0 < x0 < 1 (above) and the associated histogram (below).

the plaintext and outputting the ciphertext, where the name of the output (including file extension) is supplied by the user. Crypstic™ outputs the ciphertext by overwriting the input file. This allows the file name, including the extension, to be used to seed the encryption engine and thus requires that the name of the file remains unchanged in order to decrypt. The seed is used to initiate the session key discussed in Section XI(B). The file name is converted to an ASCII 7-bit decimal integer stream which is then concatenated, and the resulting decimal integer is used to seed a hash function whose output is of the form (d, d, f, f, f) where d is a decimal integer and f is a 32-bit precision floating point number between 0 and 1.
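The following Python sketch illustrates this construction under stated assumptions: the 7-bit ASCII codes of the file name are concatenated into a single decimal integer, and a standard hash (here SHA-256, standing in for the hash function actually used, whose details are not given) is reduced to the (d, d, f, f, f) form:

    import hashlib

    def seed_from_filename(filename):
        concatenated = int(''.join(str(ord(ch)) for ch in filename))
        digest = hashlib.sha256(str(concatenated).encode()).digest()
        d1, d2 = digest[0] % 10, digest[1] % 10                     # two decimal integers
        floats = [int.from_bytes(digest[2 + 4 * i: 6 + 4 * i], 'big') / 2 ** 32
                  for i in range(3)]                                # three floats in [0, 1)
        return (d1, d2, *floats)

    session_seed = seed_from_filename('report.doc')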
The executable file is camouflaged as a .dll file which is embedded in a folder containing many such .dll files. The reason for this is that the structure of a .dll file is close to that of a .exe file. Nevertheless, this requires that the source code must be written in such a way that all references to its application are void, as discussed in Section II(E). This includes all references to the nature of the data processing involved, including words such as 'Encrypt' and 'Decrypt', for example23, so that the compiled file, although camouflaged as a .dll file, is forensically inert to attacks undertaken with systems such as WinHEX [93]. In other words, the source code should be written in a way that is incomprehensible, a condition that is consistent with the skills of many software engineers! This must include the development of a run time help facility. Clearly, such criteria are at odds with the conventional wisdom associated with the development of applications but the purpose of this approach is to develop a forensically inert executable file that is obfuscated by the environment in which it is placed. An example of the GUI is given in Figure 22.

23 Words that can be replaced by 'E' and 'D' respectively in a GUI.

Fig. 22. GUI of the Crypstic™ encryption application.
D. Procedure
The approach to loading the application to encrypt/decrypt a file is based on renaming the .dll file to an .exe file with a given name as well as the correct extension. Simply renaming a .dll file in this way can lead to a possible breach of security by a potential attacker using a key logging system [94]. In order to avoid such an attack, Crypstic™ uses an approach in which the name of the .dll file can be renamed to a .exe file by using

a 'deletion dominant' procedure. For example, suppose the application is called enigma.exe; then, by generating a .dll file called engine gmax index.dll, renaming can be accomplished by deleting (in the order given) 'lld.' followed by 'dni x' followed by 'en' followed by 'g', and then inserting a '.' between 'a' and 'e' and including an 'e' after 'ex'. A further application is required such that, upon closing the application, the .exe file is renamed back to its original .dll form. This includes ensuring that the time and date stamps associated with the file are not updated.
The procedure described above is an attempt to obfuscate the use of passwords, which are increasingly open to attack [18]. For example, the Russian based company Elcomsoft Limited recently filed a US patent for a password cracking technique that relies on the parallel processing capabilities of modern graphics processors. The technique increases the speed of password cracking by a factor of 25 using a GeForce 8800 Ultra graphics card from Nvidia. Cracking times can be reduced from days or hours to minutes in some instances and there are plans to introduce the technique into password cracking products (http://techreport.com/discussions.x/13460).
E. Protocol
Crypstic™ is a symmetric encryption system that relies on the user working with a USB memory stick and maintaining a protocol that is consistent with the use of a conventional set of keys, typically located on a key ring. The simplest use of Crypstic™ is for a single user to be issued with a Crypstic™ which incorporates an encryption engine that is unique (through the utilisation of a unique set of algorithms which is registered with Crypstic™ Limited for a given user). The user can then use the Crypstic™ to encrypt/decrypt files and/or folders (after application of a compression algorithm such as pkzip, for example) on a PC before closure of a session. In this way, the user maintains a secure environment using a unique encryption engine with a key that includes a covert access route, coupled with appropriate disinformation as discussed in previous sections. Different encryption engines can be incorporated that are used to encrypt disinformation in order to provide a solution to the 'gun to the head' problem as required.
In the case of communications between Alice and Bob, both users are issued with Crypstics that have encryption engines unique to Alice and Bob, each of whom can use the facility for personal data security as above and, in addition, can encrypt files for email communications. If any stick, by


any party, is lost, then a new pair of Crypstics is issued with new encryption engines unique to both parties. In addition to a two-party user system, Crypstics can be issued to groups of users in a way that provides an appropriate access hierarchy as required.
XIII. DISCUSSION
The material discussed in this paper has covered some of the basic principles associated with cryptography in general, including the role of diffusion and confusion for designing ciphers that have no statistical bias. This has been used as a guide in the design of ciphers that are based on the application of IFS exhibiting chaotic behaviour. The use of IFS allows for the design of encryption engines that are multi-algorithmic, each algorithm being based on an IFS that is invented subject to the condition that the output stream has a uniform PDF. The principle of multi-algorithmicity has been used to develop a new product - Crypstic™ - that is based on the following: (i) a multi-algorithmic block encryption engine consisting of a unique set of IFS; (ii) maximum entropy conversion to a bit stream cipher; (iii) a key that is determined by the name of the file to be encrypted/decrypted. The approach has passed all statistical tests [11] recommended by the National Institute of Standards and Technology (NIST) [95].
Access and use of the encryption engine is based on utilizing a commercial-off-the-shelf USB flash memory via a combination of camouflage, obfuscation and disinformation in order to elude any potential attacker. The approach has been based on respecting the following issues: (i) security is a process not a product; (ii) never underestimate the enemy; (iii) the longer that any cryptosystem, or part thereof, remains of the same type with the same function, the more vulnerable the system becomes to a successful attack. Point (iii) is a singularly important feature which Crypstic™ overcomes by utilizing a dynamic approach to the design and distribution of encryption engines.
A. Chaos Theory and Cryptography
We have discussed cryptography in the context of chaos theory and there is clearly a fundamental relationship between cryptography and chaos. In both cases, the object of study is a dynamic system that performs an iterative nonlinear transformation of information in an apparently unpredictable but deterministic manner. In terms of chaos theory, the sensitivity to the initial conditions together with the mixing property ensures cryptographic confusion and diffusion, as originally suggested by Shannon. However, there are also a number of conceptual differences: (i) chaos theory studies dynamic systems defined on an infinite state space (e.g. vectors of real numbers or infinite binary strings), whereas cryptography relies on a finite-state machine and all chaos models implemented on a computer are approximations, i.e. pseudo-chaos; (ii) chaos theory studies the asymptotic behaviour of a nonlinear system (the behaviour as the number of iterations approaches infinity, when the Lyapunov dimension can be quantified - see Appendix II, in which the definition of the Lyapunov dimension is based on N → ∞), whereas cryptography focuses on the effect of a small number of iterations; (iii) chaos theory is not concerned with the algorithmic complexity of the IFS, while in cryptography, complexity is the key issue; in other words, the concepts of cryptographic security and efficiency have no counterparts in chaos theory; (iv) classical chaotic systems have visually recognizable attractors whereas in cryptography we attempt to hide any visible structure; (v) chaos theory is often associated with the mathematical model used to quantify a physically significant problem, whereas in cryptography, the physical model is of no importance; (vi) unlike chaos in general, cryptographic systems use a combination of all independent variables to provide an output that is unpredictable to an observer. The following table provides a comparison between chaos theory and cryptography in terms of those aspects of the two subjects that have been considered in this paper.
Chaos Theory                                            Cryptography
Chaotic system                                          Pseudo-chaotic system
Nonlinear transform                                     Nonlinear transform
Infinite number of states                               Finite number of states
Infinite number of iterations                           Finite number of iterations
Initial state                                           Plaintext
Final state                                             Ciphertext
Initial condition(s) and/or parameter(s)                Key
Asymptotic independence of initial and final states     Confusion
Sensitivity to initial condition(s) and parameter(s);
mixing                                                  Diffusion
Chaotic systems are algorithmically random and thus cannot be predicted by a deterministic Turing machine, even with infinite power. However, chaotic systems are predictable by a probabilistic Turing machine and thus finding probabilistically unpredictable chaotic systems is a central problem for chaos based cryptography. In this paper, the generation of an unpredictable cipher has been undertaken by filtering those numbers that belong to a uniform partition of the PDF. This approach comes at the expense of numerical performance since a relatively large percentage of the floating point numbers that are computed are discarded.
Chaos theory is not related to number theory in the same way as conventional cryptographic algorithms, nor is chaos theory related to the computational complexity analysis that underpins digital cryptography. Hence, neither chaos nor pseudo-chaos can guarantee pseudo-randomness and resistance to different kinds of cryptanalysis based on conventional scenarios. The use of floating-point arithmetic is the most obvious solution for approximating continuous chaos on a finite-state machine. However, there is no straightforward application to pseudo-random number generation and cipher generation. Critical problems can include: (i) growing rounding-off errors; (ii) structural instability, i.e. different initial conditions and parameters yield different cryptographic properties, such as very short cycles, weak plaintexts or weak keys.


Chaotic systems based on smooth nonlinear functions (e.g. x², sin(x), tan(x) and log(x)) produce sequences with a highly non-uniform distribution and can therefore be predicted by a probabilistic machine. By applying a partitioning strategy to generate a uniform output, a bit stream cipher with uniform statistical properties can be obtained which passes all pseudo-randomness tests. Some piecewise-linear maps generate sequences which have theoretically flat distributions. However, in practice, these maps are less suitable than nonlinear maps because of the overall effect of linearity, rounding and iterative transformations, and may be characterised by highly non-uniform statistics. The need to post-process the outputs from chaotic iterators in order to provide bit-streams with no statistical bias leads to a cryptosystem that is relatively inefficient when compared to conventional PRNGs. Further, the lack of any fundamental theoretical framework with regard to the pseudo-random properties of IFS leads to a basic incompatibility with modern cryptography. However, this is offset by the flexibility associated with the use of multi-algorithmicity for generating numerous and, theoretically, an unlimited number of unique encryption engines.
All conventional cryptographic systems (encryption schemes, pseudo-random generators, hash functions) can be considered to be binary pseudo-chaotic systems, based on bit stream encryption defined over a finite space. Such systems are periodic but have a limited sensitivity to the initial conditions, i.e. the Lyapunov exponents are positive only if measured at the beginning of the process (before one can see the cycles). The mixing property leads to pseudo-randomness. Pseudo-chaotic systems typically have many orbits of different length. Measuring the minimal, average and maximal length of a system is not a trivial problem but, clearly, ideal cryptographic systems have a single orbit that includes all the possible states.
Iterative block ciphers can be viewed as a combination of two linked pseudo-chaotic systems: data and round-key systems. The iterated functions of such systems include nonlinear substitutions, row shifts, column mixing etc. The round-key system is a pseudo-random generator providing a sensitive dependence of the ciphertext on the key. Technically, most pseudo-random generators are based on the stretch-and-fold transformation: first, the state is stretched over a large space (e.g. by multiplying or raising to a power), then folded into the original state space (using a periodic function such as mod and sin). In mathematical chaos, the stretch-and-fold transformation forms the basis of the majority of iterated functions.
In the design of any chaos based cryptosystem, it is of paramount importance to have a structurally stable cryptosystem, i.e. a system that has (almost) the same cycle length and Lyapunov exponents for all initial conditions and a given control parameter set. Many pseudo-chaotic systems do not possess this quality. Approximations to chaos are usually based on fixed precision computations. However, it is possible to increase the precision or resolution (e.g. the length of a binary state string) in each iteration, a precision that can, according to a set of rules, be used to estimate any error impact. A one-way transformation forms the basis of a PRNG, whereas a key-dependent invertible transformation is the essence of a cipher or encryption scheme. Most chaos based ciphers can be extended to include invertible transformations such as XOR, cyclic shifts and other permutations, and the latter transformations can also be considered as pseudo-chaotic maps. Further, asymmetric cryptographic systems are based on trapdoor functions, i.e. functions that have the one-way property unless a secret parameter (trapdoor) is known. No counterpart of a trapdoor transformation is known in chaos theory and thus it is not currently possible to produce an equivalent to the RSA algorithm using an IFS. However, it is noted that asymmetric encryption algorithms such as the RSA algorithm can be used to transfer a data base of algorithms used for the multi-algorithmic symmetric encryption scheme considered in this paper.
B. Covertext and Stegotext
One of the principal weaknesses of all encryption systems is that the form of the output data (the ciphertext), if intercepted, alerts the intruder to the fact that the information being transmitted may have some importance and that it is therefore worth attacking and attempting to decrypt it. In Figure 1, for example, if a postal worker observed some sophisticated strong box with an impressive lock passing through the post office, it would be natural for them to wonder what might be inside. It would also be natural to assume that the contents of the box would have a value in proportion with the strength of the box/lock. These aspects of ciphertext transmission can be used to propagate disinformation, achieved by encrypting information that is specifically designed to be intercepted and decrypted. In this case, we assume that the intercept will be attacked, decrypted and the information retrieved. The key to this approach is to make sure that the ciphertext is relatively strong (but not too strong!) and that the information extracted is of high quality in terms of providing the attacker with intelligence that is perceived to be valuable and compatible with their expectations, i.e. information that reflects the concerns/interests of the individual(s) and/or organisation(s) that encrypted the data. This approach provides the interceptor with a 'honey pot' designed to maximize their confidence, especially when they have had to put a significant amount of work in to extracting it. The trick is to make sure that this process is not too hard or too easy. Too hard will defeat the object of the exercise as the attacker might give up; too easy, and the attacker will suspect a set-up!
In addition to providing an attacker with a honey-pot for the dissemination of disinformation, it is of significant value if a method can be found that allows the real information to be transmitted by embedding it in non-sensitive information after (or otherwise) it has been encrypted, e.g. camouflaging the ciphertext using methods of Steganography. This provides a significant advantage over cryptography alone in that encrypted messages do not attract attention to themselves. No matter how well plaintext is encrypted (i.e. how unbreakable it is), by default, a ciphertext will arouse suspicion and may in itself be incriminating, as in some countries encryption is illegal. With reference to Figure 1, Steganography is equivalent to transforming the strong box into some other object that


will pass through without being noticed - a chocolate-box, for example.
The word Steganography is of Greek origin and means 'covered, or hidden, writing'. In general, a steganographic message appears as something else, or 'Covertext'. The conversion of a ciphertext to another plaintext form is called 'Stegotext conversion' and is based on the use of covertext. Some covertext must first be invented and the ciphertext embedded or mapped onto it in some way to produce the stegotext. The basic principle is given in the following schematic diagram:

Data (Plaintext) → Ciphertext → (+ Covertext) → Stegotext → Transmission

Note that this approach does not necessarily require the use of plaintext to ciphertext conversion as illustrated above and that plaintext can be converted into stegotext directly. For example, a simple approach to this is to use a mask to delete all characters in a message except those that are to be read by the recipient of the message. Apart from establishing a method of exchanging the mask, which is equivalent to the key in cryptography, the principal problem with this approach is that different messages have to be continuously invented in order to accommodate hidden messages and that these inventions must appear to be legitimate. However, the wealth of data that is generated and transmitted in today's environment and the wide variety of formats that are used means that there is much greater potential for exploiting steganographic methods than was previously available. In other words, the wealth of information now available has generated a camouflage rich environment in which to operate and one can attempt to hide plaintext or ciphertext (or both) in a host of data types, including audio and video files and digital images. Moreover, by understanding the characteristics of a transmission environment, it is possible to conceive techniques in which information can be embedded in the transmission noise, i.e. where natural transmission noise is the covertext. There are of course a range of counter measures - steganalysis - that can be implemented in order to detect stegotext. However, the techniques usually require access to the covertext which is then compared with the stegotext to see if any modifications have been introduced. The problem is to find ways of obtaining the original stegotext, which is equivalent to a plaintext attack.
C. Hiding Data in Images
The relatively large amount of data contained in digital images makes them a good medium for undertaking steganography. Consequently, digital images can be used to hide messages in other images. A colour image typically has 8 bits to represent each of the red, green and blue components. Each colour component is composed of 256 colour values and the modification of some of these values in order to hide other data is undetectable by the human eye. This modification is often undertaken by changing the least significant bit in the binary representation of a colour or grey level value (for grey level digital images). For example, the grey level value 128 has the binary representation 10000000. If we change the least significant bit to give 10000001 (which corresponds to a grey level value of 129) then the difference in the output image, in terms of a single pixel, will not be discernible. Hence, the least significant bit can be used to encode information through modification of pixel intensity. Further, if this is done for each colour component then a letter of ASCII text can be represented for every three pixels. The larger the host image compared with the hidden image, the more difficult it is to detect the message. Further, it is possible to hide one image in another, for which there are a number of approaches available.
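A minimal Python sketch of least significant bit embedding as described above; each message bit replaces the LSB of one 8-bit pixel value (the host values shown are illustrative):

    def embed(pixels, bits):
        return [(p & 0xFE) | b for p, b in zip(pixels, bits)]

    def extract(pixels):
        return [p & 1 for p in pixels]

    host = [128, 64, 200, 17, 99, 250, 31, 7]    # illustrative grey level values
    message = [1, 0, 1, 1, 0, 0, 1, 0]
    stego = embed(host, message)                  # 128 becomes 129, 64 stays 64, ...
    assert extract(stego) == message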
Crypstic™ explicitly uses the method discussed in Section VII on Stochastic Diffusion for steganographic applications. The plaintext (which, in the case of written material, is limited in this application to an image of a single text page) is first converted into an image file which is then diffused with a noise field generated by Crypstic™. The host image (which is embedded in an environment of different digital images) is distributed with each Crypstic™ depending on the protocol and user network associated with its application. Note that the host image represents, quite literally, the key to recovering the hidden image. The additive process applied is equivalent to the process of confusion that is the basis for a substitution cipher. Rather than the key being used to generate a random number stream using a pre-defined algorithm from which the stream can be re-generated (for the same key), the digital image is, in effect, being used as the cipher. By diffusing the image with a noise field, it is possible to hide the output in a host image without having to resort to quantization. In the case of large plaintext documents, each page is converted into an image file and the image stream embedded in a host video.
D. Hiding Data in Noise
The art of steganography is to use whatever covertext is readily available to make the detection of the plaintext or, ideally, the ciphertext as difficult as possible. This means that the embedding method used to introduce the plaintext/ciphertext into the covertext should produce a stegotext that is indistinguishable from the covertext in terms of its statistical characteristics and/or the information it conveys. From an information theoretic point of view, the covertext should have significantly more capacity than the ciphertext, i.e. there must be a high level of redundancy. Utilising noisy environments often provides an effective solution to this problem. There are three approaches that can be considered: (i) embedding the ciphertext in real noise; (ii) transforming the ciphertext into noise that is then added to data; (iii) replacing real noise with ciphertext that has been transformed into synthetic noise with exactly the same properties as the real noise.
In the first case, we can make use of noise sources such as thermal noise, flicker noise and shot noise associated with the electronics that digitize an analogue signal. In digital imaging this may be noise from the imaging Charge Coupled Device (CCD) element; for digital audio, it may be noise associated with the recording techniques used or amplification equipment. Natural noise generated in electronic equipment usually provides enough variation in the captured digital information that


it can be exploited as a noise source to cover hidden data. Because such noise is usually a linear combination of different noise types generated by different physical mechanisms, it is usually characterised by a normal or Gaussian distribution as a result of the Central Limit Theorem (see Appendix I).
In the second case, the ciphertext is transformed into noise whose properties are consistent with the noise that is to be expected in certain data fields. For example, lossy compression schemes (such as JPEG - Joint Photographic Expert Group) always introduce some error (numerical error) into the decompressed data and this can be exploited for steganographic purposes. By taking a clean image and adding ciphertext noise to it, information can be transmitted covertly, providing all users of the image assume that it is the output of a JPEG or some other lossy compressor. Of course, if such an image is JPEG compressed, then the covert information may be badly corrupted.
In the third case, we are required to analyse real noise and derive an algorithm for its synthesis. Here, the noise has to be carefully synthesized because it may be readily observable as it represents the data stream in its entirety rather than data that is cloaked in natural noise. This technique also requires that the reconstruction/decryption method is robust in the presence of real noise that we should assume will be added to the synthesized noise during a transmission phase. In this case, random fractal models are of value because the spectral properties of many noise types found in nature signify fractal properties to a good approximation [52], [68]. This includes transmission noise over a range of radio and microwave spectra, for example, and Internet traffic noise [33]. With regard to Internet traffic noise, the time series data representing packet size and inter-arrival times shows well defined random fractal properties with a fractal dimension that varies over a 24 hour cycle. This can be used to submit emails by fracturing files into byte sizes that characterise the packet size time series and submitting each fractured file at time intervals that characterise the inter-arrival times at the point of submission [96], [97]. In both cases, the principal characteristic is the fractal dimension computed from live Internet data.

APPENDIX I
CENTRAL LIMIT THEOREM FOR A UNIFORM DISTRIBUTION

We study the effect of applying multiple convolutions of the uniform distribution

P(x) = 1/X, |x| ≤ X/2; 0, otherwise

and show that

⊗_{i=1}^N P_i(x) ≡ P_1(x) ⊗ P_2(x) ⊗ ... ⊗ P_N(x) ≃ √(6/(πX²N)) exp(−6x²/(X²N))

where P_i(x) = P(x) ∀i and N is large, by considering the effect of multiple convolutions in Fourier space (through application of the convolution theorem) and then working with a series representation of the result.

The Fourier transform of P(x) is given by

P̃(k) = ∫_{−∞}^{∞} P(x) exp(ikx) dx = (1/X) ∫_{−X/2}^{X/2} exp(ikx) dx = sinc(kX/2)

where sinc(x) = sin(x)/x - the sinc function. Thus,

P(x) ↔ sinc(kX/2)

where ↔ denotes transformation into Fourier space, and from the convolution theorem it follows that

Q(x) ≡ ⊗_{i=1}^N P_i(x) ↔ sinc^N(kX/2).

Using the series expansion of the sin function, for an arbitrary constant α,

sinc(αk) = (1/αk)[αk − (1/3!)(αk)³ + (1/5!)(αk)⁵ − (1/7!)(αk)⁷ + ...]
         = 1 − (1/3!)(αk)² + (1/5!)(αk)⁴ − (1/7!)(αk)⁶ + ...

The Nth power of sinc(αk) can be written in terms of a binomial expansion, giving

sinc^N(αk) = [1 − ((1/3!)(αk)² − (1/5!)(αk)⁴ + (1/7!)(αk)⁶ − ...)]^N

= 1 − N[(1/3!)α²k² − (1/5!)α⁴k⁴ + (1/7!)α⁶k⁶ − ...]
  + (N(N−1)/2!)[(1/3!)α²k² − (1/5!)α⁴k⁴ + ...]²
  − (N(N−1)(N−2)/3!)[(1/3!)α²k² − ...]³ + ...

= 1 − (N/3!)α²k² + [N/5! + N(N−1)/(2!(3!)²)]α⁴k⁴
  − [N/7! + N(N−1)/(3!5!) + N(N−1)(N−2)/(3!(3!)³)]α⁶k⁶ + ...

Now, the series representation of the exponential (for an arbitrary positive constant c) is

exp(−ck²) = 1 − ck² + (1/2!)c²k⁴ − (1/3!)c³k⁶ + ...


Equating terms involving k², k⁴ and k⁶, it is clear that (evaluating the factorials)

c = (1/6)Nα²,

(1/2!)c² = [(1/120)N + (1/72)N(N−1)]α⁴ or c² = [(1/36)N² − (1/90)N]α⁴

and

(1/3!)c³ = [(1/5040)N + (1/720)N(N−1) + (1/1296)N(N−1)(N−2)]α⁶
or c³ = [(1/216)N³ − (1/1080)N² + (1/2835)N]α⁶.

Thus, by deduction, we can conclude that

cⁿ = (1/6)ⁿ Nⁿ α^(2n) + O(N^(n−1) α^(2n)).

Now, for large N, the first term in the equation above dominates to give the following approximation for the constant c:

c ≃ (1/6)Nα².

We have therefore shown that the Nth power of the sinc(αk) function approximates to a Gaussian function (for large N), i.e.

sinc^N(αk) ≃ exp(−(1/6)Nα²k²).

Thus, if α = X/2, then

Q(x) ↔ exp(−(1/24)X²Nk²)

approximately. The final part of the proof is therefore to Fourier invert the function exp(−X²Nk²/24), i.e. to compute the integral

I = (1/2π) ∫_{−∞}^{∞} exp(−(1/24)X²Nk²) exp(ikx) dk
  = (1/2π) exp(−6x²/(X²N)) ∫_{−∞}^{∞} exp[−(√(X²N/24) k − ix√(6/(X²N)))²] dk
  = (1/2π) √(24/(X²N)) exp(−6x²/(X²N)) ∫ exp(−y²) dy

after making the substitution

y = √(X²N/24) k − ix√(6/(X²N)).

By Cauchy's theorem, the contour may be taken along the real axis and, using the result ∫_{−∞}^{∞} exp(−y²) dy = √π, we obtain

I = √(6/(πX²N)) exp(−6x²/(X²N)).

Thus, we can write

Q(x) ≃ √(6/(πX²N)) exp[−6x²/(X²N)]

for large N.

APPENDIX II
THE LYAPUNOV DIMENSION

Consider the iterative system

f_{n+1} = F(f_n) = f + ε_n

where ε_n is a perturbation to the value of f at an iterate n which is independent of the value of f_0. If the system converges to f as n → ∞, then ε_n → 0 as n → ∞ and the system is stable. If this is not the case, then the system may be divergent or chaotic. Suppose we model ε_n in terms of an exponential growth (λ > 0) or decay (λ < 0) so that

ε_n = c exp(λn)

where c is an arbitrary constant. Then ε_1 = c, ε_2 = ε_1 exp(λ), ε_3 = ε_1 exp(2λ) = ε_2 exp(λ) and thus, in general, we can write

ε_{n+1} = ε_n exp(λ)

so that

λ = ln(ε_{n+1}/ε_n).

Noting that

Σ_{n=1}^{N} ln(ε_{n+1}/ε_n) = Nλ,

we can define λ as

λ = lim_{N→∞} (1/N) Σ_{n=1}^{N} ln(ε_{n+1}/ε_n).

The constant λ is known as the Lyapunov exponent. Since we can write

λ = lim_{N→∞} (1/N) Σ_{n=1}^{N} (ln ε_{n+1} − ln ε_n)

and noting that (using forward differencing)

(d/dx) ln ε ≃ (ln ε_{n+1} − ln ε_n)/Δx, Δx = 1,

we see that λ is, in effect, given by the mean value of the derivatives of the natural logarithm of ε. Note that, if the value of λ is negative, then the iteration is stable and will approach f since we can expect that, as N → ∞, ε_{n+1}/ε_n < 1 and, thus, ln(ε_{n+1}/ε_n) < 0. If λ is positive, then the iteration will not converge to f but will diverge or, depending on the characteristics of the mapping function F, will exhibit chaotic behaviour. The Lyapunov exponent is a parameter that


is a characterization of the chaoticity of the signal f_n. In particular, if we compute λ_N using N elements of the signal f_n and then compute λ_M using M elements of the same signal, we can define the Lyapunov dimension as

D_L = 1 − λ_N/λ_M, M > N; 1 − λ_M/λ_N, M < N

where

λ_N = lim_{N→∞} (1/N) Σ_{n=1}^{N} ln(ε_{n+1}/ε_n).
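As a numerical illustration of the above (not part of the original appendix), the following Python sketch estimates the Lyapunov exponent of the logistic map at r = 4, using the standard derivative form of the ε-ratio sum, for which the known value is ln 2:

    import math

    r, x, N = 4.0, 0.3, 100000
    total = 0.0
    for _ in range(N):
        total += math.log(abs(r * (1 - 2 * x)))   # |f'(x)| plays the role of eps_{n+1}/eps_n
        x = r * x * (1 - x)
    print(total / N, 'compared with ln 2 =', math.log(2))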

ACKNOWLEDGMENT
Some aspects of this paper are based on the research and
PhD Theses of the following former research students of the
author: Dr S Mikhailov, Dr N Ptitsyn, Dr D Dubovitski, Dr
K Mahmoud, Dr N Al-Ismaili and Dr R Marie. The author is
grateful to the following for their contributions to CrypsticTM
Limited: Mr Bruce Murray, Mr Dean Attew, Major General
John Holmes, Mr William Kidd, Mr David Bentata and Mr
Alan Evans.
REFERENCES
[1] S. Singh, The Code Book: The Evolution of Secrecy from Mary, Queen
of Scots to Quantum Cryptography, Doubleday, 1999.
[2] B. Schneier, Beyond Fear: Digital Security in a Networked World, Wiley,
2000.
[3] B. Schneier, Thinking Sensibly about Security in an Uncertain World,
Copernicus Books, 2003.
[4] N. Ferguson and B. Schneier, Practical Cryptography, Wiley, 2003.
[5] A. J. Menezes, P. C. van Oorschot and S. A. Vanstone, Handbook of
Applied Cryptography, CRC Press, 2001.
[6] B. Schneier, Applied Cryptography, Second Edition Wiley, 1996.
[7] J. Buchmann, Introduction to Cryptography, Springer 2001.
[8] O. Goldreich, Foundations of Cryptography, Cambridge University
Press, 2001.
[9] J. Hershey, Cryptography Demystified, McGraw-Hill, 2003.
[10] H. F. Gaines, Cryptanalysis, Dover, 1939.
[11] N. V. Ptitsyn, Deterministic Chaos in Digital Cryptography, PhD Thesis,
De Montfort University, 2003.
[12] http://vl.fmnet.info/safety/
[13] http://www.amazon.com/Network-Security-process-not-product
[14] http://en.wikipedia.org/wiki/Enigma_Machine
[15] S. Katzenbeisser and F. Petitcolas, Information Hiding Techniques for
Steganography and Digital Watermarking, Artech House, 2000.
[16] N. F. Johnson, Z. Duric and S. Jajodia, Information Hiding: Steganography and Watermarking - Attacks and Countermeasures, Kluwer Academic Publishers, 2001.
[17] G. Kipper, Investigators Guide to Steganography, CRC Press, 2004.
[18] Hackers Black Book, http://www.hackersbook.com
[19] A. N. Shulsky and G. J. Schmitt, Silent Warfare: Understanding the
World of Intelligence, Brassey's, 2002.
[20] R. Hough, The Great War at Sea, Oxford University Press, 1983
[21] P. G. Halpern, A Naval History of World War One, Routledge, 1994.
[22] R. A. Ratcliff, Delusions of Intelligence, Cambridge University Press,
2006.
[23] R. A. Woytak, On the Border of War and Peace: Polish Intelligence
and Diplomacy and the Origins of the Ultra-Secret, Columbia University
Press, 1979.
[24] W. Kozaczuk, Enigma: How the German Machine Cipher was Broken,
and how it was Read by the Allies in World War Two, University
Publications of America, 1984.
[25] B. Booss-Bavnbek and J. Hoyrup, Mathematics at War, Birkhäuser, 2003.
[26] B. J. Copeland, Colossus: The Secrets of Bletchley Park's Code Breaking
Computers, Oxford University Press, 2006.
[27] A. Stripp and F. H. Hinsley, Codebreakers: The Inside Story of Bletchley
Park, Oxford University Press, 2001.
[28] http://www.gchq.gov.uk/

[29] W. R. Harwood, The Disinformation Cycle: Hoaxes, Delusions, Security


Beliefs, and Compulsory Mediocrity, Xlibris Corporation, 2002.
[30] R. Miniter, Disinformation, Regnery Publishing, 2005.
[31] T. Newark and J. F. Borsarello, Book of Camouflage, Brassey's, 2002.
[32] J. M. Blackledge, B. Foxon and M. Mikhailov, Fractal Modulation
Techniques for Digital Communications Systems, Proceedings of IEEE
Conference on Military Communications, October 1998, Boston, USA.
[33] J. M. Blackledge, S. Mikhailov and M. J. Turner, Fractal Modulation
and other Applications from a Theory on the Statistics of Dimension,
Fractal in Multimedia (Eds M F Barnsely, D Saupe and E R Vrscay),
The IMA Volumes in Mathematics and its Applications, Springer, 2002,
175-195.
[34] H. Gerrard and P. D. Antill, Crete 1941: Germany's Lightning Airborne
Assault, Osprey Publishing, 2005.
[35] http://eprint.iacr.org/1996/002.
[36] J. Buchmann, Introduction to Cryptography, Springer, 2001.
[37] H. Delfs and H. Knebl, Introduction to Cryptography: Principles and
Applications, Springer, 2002.
[38] V. V. Ashchenko, V. V. Jascenko and S. K. Lando, Cryptography: An
Introduction, American Mathematical Society, 2002.
[39] A. Salomaa, Public Key Cryptography, Springer, 1996.
[40] Articsoft Technologies, Introduction to Encryption, 2005;
http://www.articsoft.com/wp explaining encryption.htm.
[41] C. Ellison and B. Schneier, Ten Risks of PKI: What You're Not Being Told
About Public Key Infrastructure, Computer Security Journal, XVI(1),
2000; http://www.schneier.com/paper-pki.pdf.
[42] P. Garrett, Making, Breaking Codes, Prentice Hall, 2001.
[43] P. Reynolds, Breaking Codes: An Impossible Task?, 2004;
http://news.bbc.co.uk/1/hi/technology/3804895.stm.
[44] A. G. Webster, Partial Differential Equations of Mathematical Physics,
Stechert, 1933.
[45] P. M. Morse and H. Feshbach, Methods of Theoretical Physics, McGraw-Hill, 1953.
[46] E. Butkov, Mathematical Physics, Addison-Wesley, 1973.
[47] G.A. Evans, J. M. Blackledge and P. Yardley, Analytical Solutions to
Partial Differential Equations, Springer, 1999.
[48] G. F. Roach, Green's Functions (Introductory Theory with Applications),
Van Nostrand Reinhold, 1970.
[49] I. Stakgold, Green's Functions and Boundary Value Problems, Wiley,
1979.
[50] P. A. M. Dirac, The Principles of Quantum Mechanics, Oxford University Press, 1947.
[51] R. F. Hoskins, The Delta Function, Horwood Publishing, 1999.
[52] J. M. Blackledge, Digital Image Processing, Horwood Publishing, 2005.
[53] A. Papoulis, The Fourier Integral and its Applications, McGraw-Hill,
1962.
[54] R. N. Bracewell, The Fourier Transform and its Applications, McGraw-Hill, 1978.
[55] G. P. Wadsworth and J. G. Bryan, Introduction to Probability and
Random Variables, McGraw-Hill, 1960.
[56] B. L. van der Waerden, Mathematical Statistics, Springer-Verlag, 1969.
[57] R. G. Laha and E. Lukacs, Applications of Characteristic Functions,
Griffin, 1964.
[58] E. J. Watson, Laplace Transforms and Applications, Van Nostrand
Reinhold, 1981.
[59] D. Wackerly, R. L. Scheaffer and W. Mendenhall, Mathematical Statistics with Applications (6th Edition), Duxbury, May 2001.
[60] S. S. Wilks, Mathematical Statistics, Wiley, 1962.
[61] M. Born and E. Wolf, Principles of Optics (6th Edition), Pergamon
Press, Oxford, 1980.
[62] E. G. Steward, Fourier Optics: An Introduction, Horwood Scientific
Publishing, 1987.
[63] M. V. Klein and T. E. Furtak, Optics, Wiley, 1986.
[64] E. Hecht, Optics, Addison-Wesley, 1987.
[65] I. J. Cox, M. L. Miller and J. A. Bloom, Digital Watermarking, Morgan
Kaufmann, 2002.
[66] B. B. Mandelbrot, The Fractal Geometry of Nature, Freeman, 1983.
[67] M. F. Barnsley, R. L. Dalvaney, B. B. Mandelbrot, H. O. Peitgen,
D. Saupe and R. F. Mandelbrot, The Science of Fractal Images, Springer,
1988.
[68] M. J. Turner, J. M. Blackledge and P. R. Andrews, Fractal Geometry in
Digital Imaging, Academic Press, 1997.
[69] C. E. Shannon, A Mathematical Theory of Communication, Bell System
Technical Journal, 27, 379-423 (July), 623-656 (October), 1948.
[70] J. Sethna, Statistical Mechanics : Entropy, Order Parameters and
Complexity, Oxford University Press, 2006.

[71] B. B. Buck and V. A. Macaulay (Eds.), Maximum Entropy in Action,
Clarendon Press, 1992.
[72] http://www.freedownloadscenter.com/Best/des3-source.html
[73] http://csrc.nist.gov/publications/fips/fips197/fips-197.pdf
[74] E. Biham, Cryptanalysis of the Chaotic Map Cryptosystem Suggested
at EUROCRYPT'91, Technical paper, 1991,
http://citeseer.nj.nec.com/175190.html
[75] M. E. Bianco and D. Reed, An Encryption System Based on Chaos
Theory, US Patent No. 5048086, 1991.
[76] L. J. Kocarev, K. S. Halle, K. Eckert and L. O. Chua, Experimental
Demonstration of Secure Communications via Chaotic Synchronization,
IJBC, 2(3), 709-713, 1992.
[77] T. Carroll and L. M. Pecora, Synchronization in Chaotic Systems,
Phys. Rev. Letters, 64(8), 821-824, 1990.
[78] T. Carroll and L. M. Pecora, Driving Systems with Chaotic Signals, Phys.
Rev. A, 44(4), 2374-2383, 1991.
[79] T. Carroll and L. M. Pecora, A Circuit for Studying the Synchronization
of Chaotic Systems, Journal of Bifurcation and Chaos, 2(3), 659-667,
1992.
[80] J. M. Carroll, J. Verhagen and P. T. Wong, Chaos in Cryptography: The
Escape From the Strange Attractor, Cryptologia, 16(1), 52-72, 1992.
[81] M. S. Baptista, Cryptography with Chaos, Physics Letters A, 240(1-2),
50-54, 1998.
[82] E. Alvarez, A. Fernandez, P. Garcia, J. Jimenez and A. Marcano, New
Approach to Chaotic Encryption, Physics Letters A, 263(4-6), 373-375,
1999.
[83] L. Cappelletti, An FPGA Implementation of a Chaotic Encryption
Algorithm, Bachelor Thesis, Università degli Studi di Padova, 2000;
http://www.lcappelletti.f2s.com/Didattica/thesis.pdf
[84] L. Kocarev, Chaos-based Cryptography: a Brief Overview, Journal of
Circuits and Systems, 1(3), 6-21, 2001.
[85] Y. H. Chu and S. Chang, Dynamic cryptography based on synchronized
chaotic systems, Electronics Letters, 35(12), 1999.
[86] Y. H. Chu and S. Chang, Dynamic data encryption system based on
synchronized chaotic systems, Electronics Letters, 35(4), 1999.
[87] F. Dachselt, K. Kelber and W. Schwarz, Chaotic Coding and Cryptanalysis, 1997, http://citeseer.nj.nec.com/355232.html
[88] J. Fridrich, Secure Image Ciphering Based on Chaos, Final Technical
Report. USAF, Rome Laboratory, New York, 1997.
[89] J. B. Gallagher and J. Goldstein, Sensitive dependence cryptography,
Technical Report, 1996, http://www.navigo.com/sdc/
[90] Gao, Gao's Chaos Cryptosystem Overview, Technical Report, 1996,
http://www.iisi.co.jp/ppt/enggcc/
[91] N. V. Ptitsyn, J. M. Blackledge and V. M. Chernenky, Deterministic
Chaos in Digital Cryptography, Proceedings of the First IMA Conference on Fractal Geometry: Mathematical Methods, Algorithms and
Applications (Eds. J M Blackledge, A K Evans and M Turner), Horwood
Publishing Series in Mathematics and Applications, 189-222, 2002.
[92] R. Matthews, On the derivation of a chaotic encryption algorithm,
Cryptologia, (13): 29-42, 1989.
[93] http://www.x-ways.net/winhex/
[94] http://www.wellresearchedreviews.com/computer-monitoring/
[95] http://www.nist.gov/
[96] R. Marie, Fractal-Based Models for Internet Traffic and their Application to Secure Data Transmission, PhD Thesis, Loughborough
University, 2007.
[97] R. Marie, J. M. Blackledge and H. Bez, Characterisation of Internet
Traffic using a Fractal Model, Proc. 4th IASTED Int. Conf. on Signal
Processing, Pattern Recognition and Applications, Innsbruck, 2007, 487-501.

Jonathan Blackledge received a BSc in Physics from Imperial College, London University in 1980, a Diploma of Imperial College in Plasma Physics in 1981 and a PhD in Theoretical Physics from King's College, London University in 1983. As a Research Fellow of Physics at King's College (London University) from 1984 to 1988, he specialized in information systems engineering, undertaking work primarily for the defence industry. This was followed by academic appointments at the Universities of Cranfield (Senior Lecturer in Applied Mathematics) and De Montfort (Professor in Applied Mathematics and Computing) where he established new post-graduate MSc/PhD programmes and research groups in computer aided engineering and informatics. In 1994, he co-founded Management and Personnel Services Limited where he is currently Executive Director. His work for Microsharp (Director of R & D, 1998-2002) included the development of manufacturing processes now being used for digital information display units. In 2002, he co-founded a group of companies specializing in information security and cryptology for the defence and intelligence communities, actively creating partnerships between industry and academia. He currently holds academic posts in the United Kingdom and South Africa, and in 2007 was awarded Fellowships of the City and Guilds London Institute and the Institute of Leadership and Management together with Freedom of the City of London for his role in the development of the Higher Level Qualification programmes in Engineering, ICT and Business Administration, most recently for the nuclear industry, security and financial sectors respectively.

Regular Paper
Original Contribution

Monopole and Dipole Antennas for UWB Radio Utilizing a Flex-rigid Structure

Magnus Karlsson and Shaofang Gong, Member, IEEE

Abstract: Fully integrated monopole and dipole antennas for ultra-wideband (UWB) radio utilizing flexible and rigid printed circuit boards are presented in this paper. A circular monopole antenna for the entire UWB frequency band 3.1-10.6 GHz is presented. A circular dipole antenna with an integrated balun for the frequency band 3.1-4.8 GHz is also presented. The balun utilizes broadside-coupled microstrips, integrated in the rigid part of the printed circuit board. Furthermore, an omnidirectional radiation pattern and high radiation efficiency are predicted by simulations.

Index Terms: Broadside-coupled, circular, dipole antenna, monopole antenna, UWB

I. INTRODUCTION

Ultra-wideband (UWB) radio has gained popularity in recent years [1]-[8]. The entire UWB frequency-band used for short range high speed communications has been defined as 3.1-10.6 GHz [1]-[5]. Ever since the effort to achieve one sole UWB standard halted in early 2006, two dominating UWB specifications have remained as top competitors [8]. One is based on the direct sequence spread spectrum technique [4], [7]-[8]. The other is based on the multi-band orthogonal frequency division multiplexing technique (also known as Wimedia UWB, supported by the Wimedia alliance) [5]-[6], [9]-[10]. The multi-band specification divides the frequency spectrum into 500 MHz sub-bands (528 MHz including guard carriers and 480 MHz without guard carriers, i.e., 100 data carriers and 10 guard carriers). The three first sub-bands, centered at 3.432, 3.960, and 4.488 GHz, form the so-called Mode 1 band group (3.1-4.8 GHz) [4]-[8], [10].
All the research efforts made during the era of UWB antenna development have resulted in many ideas for good wideband antennas [11]-[15]. However, the general focus has so far been on the antenna element, not so much on how the antenna can be integrated in a UWB system. Utilizing a flexible substrate, the antenna can be bent and placed in many ways without major distortion of the antenna performance [16]. In this paper the concept of utilizing a flexible and rigid (flex-rigid) substrate is presented. Using this flex-rigid concept the antenna is made on the flexible part of the flex-rigid structure. In the rigid part, other circuitries are designed and placed as with any other regular multi-layer printed circuit board. For instance, in this paper the dipole antenna balun is placed in the rigid part.

Manuscript received Oct. 23, 2007. Ericsson AB in Sweden is acknowledged for financial support of this work. Magnus Karlsson (email: magka@itn.liu.se) and Shaofang Gong are with Linköping University, Sweden.
II. OVERVIEW OF THE SYSTEM
As shown in Fig. 1 all prototypes were manufactured using
a four-metal-layer flex-rigid printed circuit board. Two dual-layer NH9326 boards were processed together with a
polyimide-based flexible substrate. The rigid and the flexible
substrates are bonded together in a printed circuit board
bonding process. The laminates are made of sheet material
(e.g., glass fabric) impregnated with a resin cured to an
intermediate stage, ready for multi-layer printed circuit board
bonding.
Table 1. Printed circuit board parameters
Parameter (Polyimide)          Dimension
Dielectric height              0.1524 mm
Dielectric constant            3.4±0.05
Dissipation factor             0.002
Parameter (NH9326)             Dimension
Dielectric height              0.254 mm
Dielectric constant            3.26±0.1
Dissipation factor             0.0025
Parameter (Metal, common)      Dimension
Metal thickness, layers 1, 4   0.035 mm
Metal thickness, layers 2, 3   0.025 mm
Metal conductivity             5.8x10^7 S/m (Copper)
Surface roughness              0.001 mm

Table 1 lists the printed circuit board parameters, with the


stack of the printed circuit board layers shown in Fig. 1a.
Metal layers 1 and 4 are thicker than metal layers 2 and 3
because the surface layers are plated twice (the embedded
metal layers 2 and 3 are plated once). Fig. 1b shows the advantage of using the flex-rigid concept, i.e., the bendable property of the flex-rigid substrate.
Fig. 1. Printed circuit board structure: (a) detailed cross-section (Metal 1, rigid, flex, rigid, Metal 4, with the antenna on Metal 2 and ground on Metal 3), and (b) bendable property.

A. Monopole antenna
Fig. 2 shows a circular monopole antenna integrated in the flex-rigid substrate. As shown in Fig. 2, the ground plane is integrated in the rigid part. The radiating antenna element is placed entirely on the flex part of the substrate. The circular antenna geometry provides good omni-directionality [11].

Fig. 2. Monopole antenna: circular monopole element on the polyimide foil, with the ground plane in the rigid part and a single-ended feed-line.

B. Dipole antenna
Fig. 3 shows a circular dipole antenna integrated in the flex-rigid substrate. The radiating antenna element is placed entirely on the flex part of the substrate. Furthermore, the balun is integrated in the rigid part of the substrate. The balun utilizes broadside-coupled microstrips [13].

Fig. 3. Circular dipole antenna: dipole element on the polyimide foil, with the balun in the rigid part and a single-ended feed-line.

C. Balun used with the dipole antennas
Fig. 4a shows an illustration of the broadside-coupled microstrips. Fig. 4b shows the proposed balun structure. The balun is used together with the dipole antennas and is built with the broadside-coupled microstrips. By implementing the balun in a multilayer structure a more compact design is achieved.

Fig. 4. Balun: (a) broadside-coupled microstrips (Metal 1 and Metal 2 above ground), and (b) the broadside-coupled balun, with a single-ended feed-line (Port 1, Metal 1), a differential feed-line (Port 2 and Port 3, Metal 2), and Metal 2 connected to ground.

III. SIMULATED RESULTS
Design and simulation were done with ADS2006A from Agilent Technologies Inc. Electromagnetic simulations were done with Momentum, a built-in 2.5D method-of-moments field solver.

A. Monopole antenna
Fig. 5 shows a voltage standing wave ratio (VSWR) simulation of the circular monopole antenna on the flex-rigid substrate. It is seen that the designed circular monopole antenna shown in Fig. 2 has a wide impedance bandwidth using the proposed flex-rigid structure. It covers the entire UWB frequency-band 3.1-10.6 GHz at VSWR<2.

Fig. 5. VSWR simulation of the monopole antenna.
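As a side note (ours, not the authors'), the VSWR criterion quoted here maps directly to the reflection coefficient; a minimal Python sketch of the conversion:

def vswr_from_s11_db(s11_db: float) -> float:
    # |Gamma| from the reflection coefficient in dB, then VSWR = (1+|Gamma|)/(1-|Gamma|)
    gamma = 10.0 ** (s11_db / 20.0)
    return (1.0 + gamma) / (1.0 - gamma)

print(vswr_from_s11_db(-9.54))  # ~2.0: the VSWR<2 matching criterion used in the text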

B. Dipole antenna
Fig. 6 shows VSWR simulation of a circular dipole antenna
on the flex-rigid substrate. Fig. 6a shows the VSWR
simulation result without the balun, and Fig. 6b shows the
simulation result with the balun. It is seen that the circular
dipole antenna has a wide impedance bandwidth using the
suggested flex-rigid structure. Furthermore, it is seen that the
balun is the component limiting the bandwidth.

Fig. 6. Dipole antenna: (a) VSWR simulation without balun, and (b) VSWR simulation with balun.

Fig. 7 shows radiation simulations of the circular dipole antenna. The radiation patterns are similar in the three sub-bands of 3.432, 3.960, and 4.488 GHz, as seen in Figs. 7a-7c. The pattern becomes slightly more focused at higher frequencies, which is expected since the physical size is larger compared to the wavelength at the higher frequencies [17].

Fig. 7. Dipole antenna simulations, normalized E-field radiation patterns at φ=0°: (a) 3.432 GHz, (b) 3.960 GHz, and (c) 4.488 GHz.

Table 2 lists simulated gain and radiation efficiency. It is seen that both the monopole and the dipole antennas have high radiation efficiency when implemented using the flex-rigid structure. The simulated gain of the dipole antenna is slightly above 2 dBi, as expected. The monopole shows an even higher gain, but this is due to the 2.5D simulation that uses an infinitely large ground-plane [11], [17]. However, the VSWR still ought to be correct since the minimum distance to the ground-plane is correct [17].
Table 2. Simulated maximum co-polarization antenna gain and efficiency
Frequency (GHz)            3.432    3.960    4.488
Gain, Monopole (dBi)       4.139    3.572    4.854
Gain, Dipole (dBi)         2.272    2.459    2.581
Efficiency, Monopole (%)   99.68    98.78    99.99
Efficiency, Dipole (%)     95.17    94.65    93.82
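Since gain is the product of radiation efficiency and directivity, the entries of Table 2 can be cross-checked; the short Python sketch below is ours, for illustration only:

import math

def implied_directivity_dbi(gain_dbi: float, efficiency_pct: float) -> float:
    # G = eta * D, so D(dBi) = G(dBi) - 10*log10(eta)
    return gain_dbi - 10.0 * math.log10(efficiency_pct / 100.0)

# Dipole at 3.432 GHz: 2.272 dBi gain at 95.17 % efficiency implies ~2.49 dBi directivity
print(implied_directivity_dbi(2.272, 95.17))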

C. Balun used with the dipole antenna
Fig. 8 shows simulations of the proposed broadside-coupled balun shown in Fig. 4. It is seen in Fig. 8a that the balun has an insertion loss (IL) of less than 0.8 dB in the Mode 1 UWB frequency-band. Fig. 8b shows a rather symmetric performance of the two signal paths; S21 and S31 are the single-ended forward transmissions from Port 1 to Port 2 and Port 3, respectively. However, it is seen that above the Mode 1 UWB frequency bandwidth (3.1-4.8 GHz) the IL increases. Fig. 8c shows the simulated phase balance (single-ended S21 and S31). It is seen that the phase shift is linear, and the phase difference is close to 90° between 2.0 and 7.1 GHz. It is noticed that a small notch is seen at 3.05 GHz. It occurs when the total length from the antenna feed-point (Port 2 and Port 3 in Fig. 4b) to the grounded end of each path of the balun is equal to one quarter wavelength. In this work, the length of the differential antenna feed-line was optimized so that the small notch fell below the UWB frequency-band.

Fig. 8. Balun: (a) simulated forward transmission, single-ended to differential port, (b) simulated forward transmission, single-ended to single-ended ports (S21 and S31), and (c) simulated phase balance.
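The quarter-wavelength condition behind the notch is easy to evaluate numerically. The following minimal Python sketch is ours, not the authors'; the path length and effective permittivity are assumed, illustrative values rather than figures from the paper:

import math

C0 = 299792458.0  # free-space speed of light, m/s

def quarter_wave_notch_hz(length_m: float, eps_eff: float) -> float:
    # Frequency at which a path of physical length L is one quarter of a guided wavelength
    return C0 / (4.0 * length_m * math.sqrt(eps_eff))

# e.g. a 15 mm path with an assumed eps_eff of 2.7 gives a notch near 3.0 GHz
print(quarter_wave_notch_hz(0.015, 2.7) / 1e9)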

IV. DISCUSSION
Simulation results show that the monopole antenna has a
wide operational frequency band. The monopole antenna is
smaller than the dipole antenna but needs a quite large ground
plane. On the contrary, the dipole antenna only requires a
small ground plane beneath the balun, but the total size is
larger. The simulations of the circular dipole antenna indicate
that the antenna has a typical radiation pattern as expected
from a common dipole antenna. It is also observed (Fig. 6)
that the balun limits the bandwidth if the dipole antenna is
used with a single-ended port. However, this antenna with the
balun (Fig. 6b) has a good bandpass characteristic, covering
the frequency band 3.1-4.8 GHz required by the Mode 1
UWB specification.
V. CONCLUSION
Simulations show that a circular monopole antenna can be implemented on a flex-rigid substrate with VSWR<2.0 over the entire UWB frequency bandwidth (3.1-10.6 GHz). Moreover, a circular dipole antenna implemented using the flex-rigid substrate can cover the Mode 1 UWB frequency bandwidth (3.1-4.8 GHz) at VSWR<1.54 without a balun and VSWR<1.68 with a balun.


REFERENCES
[1] First Report and Order, Revision of Part 15 of the Commission's Rules Regarding Ultra-wideband Transmission Systems, FCC, Washington, 2002.
[2] G. R. Aiello and G. D. Rogerson, Ultra Wideband Wireless Systems, IEEE Microwave Magazine, vol. 4, no. 2, pp. 36-47, Jun. 2003.
[3] M. Karlsson and S. Gong, Wideband patch antenna array for multi-band UWB, Proc. IEEE 11th Symp. on Communications and Vehicular Tech., Ghent, Belgium, Nov. 2004.
[4] L. Yang and G. B. Giannakis, Ultra-Wideband Communications, an Idea Whose Time has Come, IEEE Signal Processing Magazine, pp. 26-54, Nov. 2004.
[5] M. Karlsson and S. Gong, An integrated spiral antenna system for UWB, Proc. IEEE 35th European Microwave Conf., Paris, France, Oct. 2005, pp. 2007-2010.
[6] J. Balakrishnan, A. Batra, and A. Dabak, A multi-band OFDM system for UWB communication, Proc. Conf. Ultra-Wideband Systems and Technologies, Reston, VA, 2003, pp. 354-358.
[7] W. D. Jones, "Ultrawide gap on ultrawideband," IEEE Spectrum, vol. 41, no. 1, p. 30, Jan. 2004.
[8] D. Geer, "UWB standardization effort ends in controversy," Computer, vol. 39, no. 7, pp. 13-16, July 2006.
[9] S. Chakraborty, N. R. Belk, A. Batra, M. Goel and A. Dabak, "Towards fully integrated wideband transceivers: fundamental challenges, solutions and future," Proc. IEEE Radio-Frequency Integration Technology: Integrated Circuits for Wideband Communication and Wireless Sensor Networks 2005, pp. 26-29, 2 Dec. 2005.
[10] D. Geer, "UWB standardization effort ends in controversy," Computer, vol. 39, no. 7, pp. 13-16, July 2006.
[11] H. Schantz, The Art and Science of Ultrawideband Antennas, Artech House Inc., ISBN: 1-58053-888-6, 2005.
[12] M. Karlsson, P. Håkansson, A. Huynh, and S. Gong, Frequency-multiplexed Inverted-F Antennas for Multi-band UWB, IEEE Wireless and Microwave Conf. 2006, pp. 2.1-2.3, 2006.
[13] M. Karlsson and S. Gong, "A Frequency-Triplexed Inverted-F Antenna System for Ultra-wide Multi-band Systems 3.1-4.8 GHz," accepted for publication in ISAST Transactions on Electronics and Signal Processing, 2007.
[14] Z. N. Chen, M. J. Ammann, X. Qing, X. H. Wu, T. S. P. See and A. Cai, "Planar antennas," IEEE Microwave Magazine, vol. 7, no. 6, pp. 63-73, Dec. 2006.
[15] W. S. Lee, D. Z. Kim, K. J. Kim, K. S. Son, W. G. Lim and J. W. Yu, "Multiple frequency notched planar monopole antenna for multi-band wireless systems," Proc. IEEE 35th European Microwave Conf., Paris, France, Oct. 2005, pp. 535-537.
[16] B. Kim, S. Nikolaou, G. E. Ponchak, Y.-S. Kim, J. Papapolymerou and M. M. Tentzeris, "A curvature CPW-fed ultra-wideband monopole antenna on liquid crystal polymer substrate using flexible characteristic," IEEE Antennas and Propagation Society Int. Symp. 2006, pp. 1667-1670, 9-14 July 2006.
[17] V. F. Fusco, Foundations of Antenna Theory and Techniques, Edinburgh Gate, Harlow, Essex, England: Pearson Education Limited, p. 45, 2005.

Magnus Karlsson was born in Västervik, Sweden in 1977. He received his M.Sc. and Licentiate of Engineering from Linköping University in Sweden, in 2002 and 2005, respectively.
In 2003 he started his Ph.D. study in the Communication Electronics research group at Linköping University. His main work involves wideband antenna techniques, wideband transceiver front-ends, and wireless communications.

Shaofang Gong was born in Shanghai, China, in 1960. He received his B.Sc. degree from Fudan University in Shanghai in 1982, and the Licentiate of Engineering and Ph.D. degrees from Linköping University in Sweden, in 1988 and 1990, respectively.
Between 1991 and 1999 he was a senior researcher at the microelectronic institute Acreo in Sweden. From 2000 to 2001 he was the CTO at a spin-off company from the institute. Since 2002 he has been full professor in communication electronics at Linköping University, Sweden. His main research interest has been communication electronics including RF design, wireless communications and high-speed data transmissions.

Regular Paper
Original Contribution

Monofilar spiral antennas for multi-band UWB system with and without air core

Magnus Karlsson and Shaofang Gong
Linköping University, Department of Science and Technology - ITN, LiU Norrköping,
SE-601 74 Norrköping, Sweden, Phone +46 11363491

Abstract: One of the trends in wireless communication is that systems require more and more frequency spectrum. Consequently, the demand for wideband antennas increases as well. For instance, ultra wideband radio (UWB) utilizes the frequency band 3.1-10.6 GHz. Such a bandwidth is more than what is normally utilized with a single low-profile antenna. Low-profile antennas are popular because they are integratable on a printed circuit board. However, the fractional bandwidth is usually an issue for low-profile antennas because of the limited substrate height. The monofilar spiral antenna, on the other hand, has a higher fractional bandwidth, and at GHz frequencies the physical dimensions of the spiral are reasonable. Furthermore, a study of how the spiral dimensions impact antenna gain and standing wave ratio (SWR) was conducted and is presented. Simulated results were compared with measurements.

I. INTRODUCTION
Wireless communication systems require more and more spectrum, while single-module solutions grow in popularity. An integratable low-profile antenna structure limits design options regarding various performance aspects. For instance, the fractional bandwidth is limited. A monofilar spiral antenna is known to have a wide bandwidth, i.e., a good fractional bandwidth compared to other planar antennas [1-4]. One known drawback is the physical size of a planar monofilar spiral antenna, i.e., the single-arm spiral is physically large in terms of its diameter [3, 4]. Furthermore, it is known that even though the spiral may have a very wide impedance bandwidth, dispersion may cause problems for using spiral antennas in wideband impulse systems [1, 2, 5]. However, in multi-band systems the antennas need only transmit and receive a signal with limited bandwidth, i.e., the width of a single multi-band pulse determines the maximum frequency spectrum used simultaneously [6]. Flat antenna performance is desired in UWB systems [7]. A terminating resistor at the arm end may be used to reduce reflections in order to flatten the antenna input impedance, but it causes signal loss [5]. Our approach is instead to optimize the physical properties of the antenna so that the input impedance becomes acceptably close to a 50 Ω real load. The antennas are intended to be integrated on a multilayer printed circuit board (PCB). A concept to reduce the total size of the complete design is to place components on a layer below the antenna, as proposed in our previous publications [3, 6]. This requires that the antenna structure is compatible with an overall module technology. Ultra wideband radio (UWB) has been specified in the frequency range 3.1-10.6 GHz [4].

II. OVERVIEW OF THE ANTENNA
A. Monofilar Spiral Antenna
Fig. 1 shows a photo of one of the monofilar spiral antenna prototypes presented in this paper. No matter which antenna technique is used, most antennas should be connected to a 50 Ω port. The soldered 50 Ω feed point is seen in the center of the spiral. Table I lists the antenna specifications.

Fig. 1. Photo of a spiral antenna prototype.

Table I. Antenna specifications
Antenna    Turn distance (s)    Radius (r)
1.1        4.5 mm               75 mm
1.2        5.5 mm               75 mm
1.3        6.5 mm               75 mm
2.1        4.5 mm               50 mm
2.2        5.5 mm               50 mm
2.3        6.5 mm               50 mm
3.1        4.5 mm               30 mm
3.2        5.5 mm               30 mm
3.3        6.5 mm               30 mm

B. Material and PCB Structures

Table II. Four-layer PCB parameters
Parameter             Layer             Dimension
Dielectric top        RCC               Height, h=0.05 mm
Dielectric core       RO4350B           (1x) h=1.524 mm
Dielectric bottom     RCC               h=0.05 mm
Dielectric constant   All layers        3.48±0.05
Dissipation factor    All layers        0.004
Metal thickness       Top and bottom    0.025 mm
Metal thickness       RO4350B (core)    0.018 mm
Metal conductivity    All layers        5.8x10^7 S/m
Surface roughness     All layers        0.001 mm

Table II lists the material parameters used for simulations, and Fig. 2 illustrates cross-sections of the two PCB structures. Fig. 2a shows the PCB structure used for the antennas without an air core, and Fig. 2b shows the PCB structure used for all antennas with an air core. Rogers RO4350B has been used since it is suitable for radio

frequency (RF) modules [6, 8]. The proposed RF module


with an air core is built with two PCBs that are separated
with an additional air gap. Separating the antenna
element from the ground plane with an air gap improves
radiation characteristics [3, 9]. The antenna PCB and the
component PCB heights are chosen separately, and the
number of layers of the component PCB can be chosen
freely. The antenna is fed with one via, i.e., a via through
all layers to the antenna. Since the module is
manufactured as two separate PCBs the via connecting
the antenna feed point is soldered afterwards.
Fig. 2. Cross-sections of the substrates: (a) the four-layer PCB (top layer, RO4350B core and bottom layer, with ground and component layers), and (b) the air core substrate structure, in which the antenna PCB and the component PCB are separated by an air core of height h_air and the antenna is fed by a via through all layers.

Fig. 3. Layout of a monofilar spiral antenna, oriented in the XY plane: radius r, turn distance s, feed point at the centre, open end at the outer arm, and the radiating zone indicated.

C. Principle of the Monofilar Spiral
As shown in Fig. 3, the circumference of the radiation zone determines the radiation frequency. The circumference is one λ for the first radiating zone, where λ is the wavelength. When the circumference is >2λ the antenna will radiate a tilted beam [10]. The tilted beam consists of one component for each wavelength the circumference reaches, i.e., for a circumference of 2λ-3λ the beam consists of two radiating zones, and so on. In general, a radiating zone occurs when one half circumference is equal to n times λ/2. Two travelling waves exist on the spiral, forward and backward. Each wave will radiate when passing through an appropriate radiation zone [2]. The net result is a far-field composed of positive and negative modes coming from these two oppositely directed waves. The existence of the two waves determines the antenna performance in a certain direction at a specific frequency [10, 11]. A stable wave flow and a smooth standing wave ratio (SWR) are closely related to each other if the substrate loss is low. The antenna has circular polarization in the frequency range where the total length of the spiral arm is electrically large compared to the wavelength [4, 5].
Feeding of the spiral can be done either from the centre or from the outside. In the prototypes shown in this paper the feeding is done by a via to the centre of the spiral, see Fig. 3. The input impedance depends on the line width together with the distance to the ground plane, because the characteristic impedance of the spiral arm depends on the line width as in the case of a microstrip line. The real part of the input impedance can be controlled by the line width, while the imaginary part is more difficult to control. If implemented in a narrow-band system the spiral antenna can be matched using a classical RF matching technique for optimal performance in that frequency region. However, if wideband operation is required the issue must be solved with other techniques.
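The circumference rule above can be turned into a quick estimate. The following Python sketch is ours, not from the paper; it assumes a free-space wavelength unless an effective permittivity is supplied, so it is a rule-of-thumb only:

import math

C0 = 299792458.0  # free-space speed of light, m/s

def radiating_zone_freq_hz(radius_m: float, n: int = 1, eps_eff: float = 1.0) -> float:
    # Frequency at which the circumference 2*pi*r equals n guided wavelengths
    circumference = 2.0 * math.pi * radius_m
    return n * C0 / (circumference * math.sqrt(eps_eff))

# First radiating zone of a 30 mm radius spiral in air: ~1.6 GHz
print(radiating_zone_freq_hz(0.030) / 1e9)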

D. Methods
Design and simulation were done using ADS Momentum, which is a 2.5D planar electromagnetic (EM) simulator. The antennas simulated are of planar structure; therefore the Momentum engine is a reliable choice, despite the lack of surface roughness consideration [12]. The simulations were done with an infinite ground plane, imposed by ADS.
Voltage standing wave ratio (VSWR) measurements were done with a Rohde & Schwarz ZVM vector network analyzer (NWA). Radiation measurements were done in an anechoic antenna chamber. An HP 8510C vector NWA together with an HP 8517B S-parameter test set and the MiDAS 2.01g software was used to measure the antennas. Linearly polarized horn antennas with known antenna gain were used as reference antennas.
III. SIMULATION RESULTS
A. Monofilar Spiral Antennas of 50 Ω
Fig. 4 shows the layout and VSWR simulations of the monofilar spiral antennas. To optimize performance, the real part of the characteristic impedance was calculated to be 50 Ω, i.e., the line width was calculated to be 3.43 mm with a substrate thickness of 1.5 mm. Three radii with three different turn distances for each size of the antenna were designed and simulated. It is shown in all three simulations in Figs. 4a-4c that a denser turn distance (s) provides a higher fractional bandwidth. As mentioned in the introduction, in theory an infinite number of turns gives an infinite fractional bandwidth. However, owing to the chosen line width, the need for spacing between turns and the limited number of turns, the bandwidth is limited in our case, as seen in Figs. 4a-4c. Fig. 4d shows a rather good match between simulated and measured data.
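The quoted 50 Ω line width can be sanity-checked with the widely used Hammerstad-type closed-form microstrip formulas (the w/h >= 1 branch); this Python sketch is ours, using the substrate values of Table II:

import math

def microstrip_z0(w_m: float, h_m: float, eps_r: float) -> float:
    # Closed-form quasi-static impedance for w/h >= 1
    u = w_m / h_m
    eps_eff = (eps_r + 1.0) / 2.0 + (eps_r - 1.0) / (2.0 * math.sqrt(1.0 + 12.0 / u))
    return 376.73 / (math.sqrt(eps_eff) * (u + 1.393 + 0.667 * math.log(u + 1.444)))

# w = 3.43 mm on the 1.524 mm RO4350B core (eps_r = 3.48) comes out near 50 ohm
print(microstrip_z0(3.43e-3, 1.524e-3, 3.48))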
Fig. 4. VSWR simulations of monofilar spiral antennas of 50 Ω input impedance: (a) r=30 mm, (b) r=50 mm, (c) r=75 mm, and (d) simulation and measurement comparison, r=75 mm, s=4.5 mm.


B. Monofilar Spiral Antenna Gain Considerations
Fig. 5 shows antenna gain simulations for various monofilar spiral antennas. The simulations in Figs. 5a-5c are done with the substrate definitions exactly as displayed in Table II and Fig. 2a, also called 1x. The simulation marked 2x has the same material but with a double core layer height. Thus, the height from the antenna plane to the ground plane is 1.5 and 3.0 mm, respectively. Figs. 5a-5c show a series of simulations of how the distance between turns affects the antenna gain at various antenna radii. The optimal density of turns varies with the radius. Fig. 5d shows how an additional air gap improves the antenna gain. Fig. 5e shows how the dielectric loss and the substrate thickness impact the antenna gain. The substrate marked 2x-air is the 1x substrate with a 1.5 mm air distance added between the ground plane and the PCB, as seen in Fig. 2b.

Fig. 5. Antenna gain simulations: (a) antenna gain, r=30 mm, (b) antenna gain, r=50 mm, (c) antenna gain, r=75 mm, (d) antenna gain, air core, r=75 mm, s=4.5 mm, and (e) antenna gain vs. different substrates and heights.


IV. MEASURED RESULTS

A. Monofilar Spiral Antenna, r=30 mm
Fig. 6a shows VSWR measurements of a spiral antenna with a radius of 30 mm. The turn distance is 4.5 mm for the antennas measured in Fig. 6a. It is seen that the VSWR performance is limited in the lower frequency-band, due to the limited radius. The limitation was slightly less than expected from the simulations shown in Fig. 4a, but the VSWR response was less flat. The antenna with the largest turn distance has a more stable gain performance in the measured spectrum, see Fig. 6b.

Fig. 6. VSWR and antenna gain measurements of two spiral antennas with a radius of 30 mm: (a) VSWR, turn distance 4.5 mm, and (b) maximum antenna gain measured from three sweeps with φ=0°, 45°, and 90°, respectively.

B. Monofilar Spiral Antenna, r=50 mm
Fig. 7a shows a VSWR measurement of a spiral antenna with a radius of 50 mm; the turn distance is 6.5 mm. The radius of 50 mm is large enough to provide a VSWR<3 through the entire UWB frequency-band. The magnitude of the resonances is smaller than that from the antennas with a 30-mm radius because of less reflection owing to more turns. It is seen that the VSWR is slightly lower than that from the simulations shown in Fig. 4b. The antenna with the largest turn distance has the highest gain in the majority of the measured spectrum, see Fig. 7b.

Fig. 7. VSWR and antenna gain measurements of two spiral antennas with a radius of 50 mm: (a) VSWR, turn distance 6.5 mm, and (b) maximum antenna gain measured from three sweeps with φ=0°, 45°, and 90°, respectively.

C. Monofilar Spiral Antenna, r=75 mm
Figs. 8a-8c show VSWR measurements of three spiral antennas with a radius of 75 mm. The turn distance is 4.5, 5.5 and 6.5 mm for the antennas measured in Figs. 8a, 8b and 8c, respectively. It is seen that antenna 1.1 reaches VSWR<2 at 2.1 GHz. A larger fractional bandwidth is achieved as the turn distance decreases. The simulated VSWR deviates more from the measured as the frequency increases, see Figs. 4c and 8a-8c. However, the antenna with the largest turn distance has the smoothest VSWR performance. Fig. 8d shows that the radiation pattern is dependent on the frequency. It is seen in Fig. 8e that the 4.5 mm turn distance is the most optimal at low frequencies, but as the frequency increases the optimal turn distance increases.

Fig. 8. VSWR and antenna gain measurements of three spiral antennas with a radius of 75 mm: (a) turn distance 4.5 mm, (b) turn distance 5.5 mm, (c) turn distance 6.5 mm, (d) measured E-field radiation characteristics of the antenna with 4.5-mm turn distance, φ=90°, and (e) maximum antenna gain measured from three sweeps with φ=0°, 45°, and 90°, respectively.

D. Air Core Spiral Antennas
Fig. 9 shows measurements of two air core spiral antennas (see Fig. 2b) with r=75 mm. It is seen in Figs. 9a and 9b that the substrate with an air gap gives a more stable VSWR response compared to the antennas in Figs. 8a-8c, and the VSWR is kept low for a wide frequency range. It is shown in Fig. 9a that with an air core substrate (1.5 mm RO4350B + 1.5 mm air), a VSWR<2 can be achieved for the entire UWB frequency band. Besides low VSWR, the air core substrate provides good impedance stability, which results in an almost resonance-free VSWR response. Figs. 9d-9e show a much higher antenna gain compared to the equally sized antennas shown in Fig. 8. This is due to the larger height between the antenna and the ground plane and the low loss tangent of the

Fig. 9. Air core antenna measurements, r=75 mm: (a) VSWR, 1.5 mm RO4350B core + 1.5 mm air, (b) VSWR, 0.8 mm RO4350B core + 2.2 mm air, (c) measured E-field of the 1.5 mm air core antenna, φ=90°, (d) measured antenna gain, 0.8 mm RO4350B + 2.2 mm air, and (e) maximum antenna gain measured from three sweeps with φ=0°, 45°, and 90°, respectively.

V. DISCUSSIONS
A study of monofilar spiral antennas for UWB is presented in this paper. Antennas covering the entire UWB frequency band of 3.1-10.6 GHz can be realized with various SWR numbers, but the physical size limits the fractional bandwidth and determines the centre frequency. The turn distance was found to be a crucial factor for the antenna gain, and the optimal turn distance varies with the radius of the antenna. However, increasing the turn distance decreases the fractional bandwidth, so a trade-off must be made. Some mismatch between simulated and measured results is seen, especially regarding the fractional bandwidth and the centre frequency, see Fig. 4. The difference may come from the fact that the simulations are done with an infinite ground-plane while the prototype ground-plane is only 30 mm larger than the spiral. Moreover, the simulations are done with a more limited number of points. Momentum is, as mentioned, not a full 3D simulator. More advanced EM simulators like Ansoft HFSS or CST Microwave Studio may give more accurate results.
A single-arm spiral antenna can have good performance over a wide frequency range. However, the radiation pattern is dependent on the frequency, so that the performance in a certain direction varies throughout the frequency range (see Figs. 8d and 9c). The performance varies owing to the fact that the beam moves around the z-axis (see Fig. 3) such that several modes interact. Furthermore, the high-frequency ripple in VSWR affects performance and causes the antenna gain to vary throughout the frequency range. These variations are due to the physical properties of the single-arm spiral, which introduces non-real input impedance. Moreover, some inductance might be introduced by the SMA connector, i.e., the feeding via through the PCB is not impedance controlled. A more uniform and higher gain can be achieved when an air core is used (see Fig. 9). This is due to the fact that with a high gain less signal power reaches the arm end, i.e., the backward travelling wave is suppressed relative to the forward travelling wave. Furthermore, the air core effectively reduces the high-frequency variation in VSWR. In general, an additional air gap positively affects the antenna performance in two ways. Firstly, the input impedance is improved. Secondly, an air gap increases the maximum gain. The increase in gain is similar to what is reported with electromagnetic bandgap (EBG) structures [13, 14]. However, the radiation pattern of the air core monofilar spiral antenna is not as smooth as, for instance, what is reported for the EBG curl antenna [14].

VI. CONCLUSIONS
A study of monofilar spiral antennas for UWB has been conducted. How the spiral radius, turn distance, and substrate dissipation factor and thickness affect the antenna gain was shown. Moreover, antenna gain improvement with an air gap between the ground plane and the antenna plane has been shown.
Monofilar spiral antenna solutions for UWB optimized for an input source impedance of 50 Ω were designed, simulated and measured. It is shown that a planar monofilar spiral antenna implemented on a PCB is suitable for RF module designs.
Monofilar spiral antennas with radii of 30, 50 and 75 mm were designed, simulated and measured. A monofilar spiral antenna of these sizes covers the entire UWB frequency band (3.1-10.6 GHz) at VSWR<4, <3, and <2, respectively.
Monofilar spiral antennas with a radius of 75 mm and an air core were also designed, simulated and measured. A monofilar spiral antenna with the air core covers the entire UWB frequency band at VSWR<1.8. The VSWR response is also more stable, and the antenna gain is higher than that of any of the monofilar antennas without an air core.
REFERENCES
[1] R. G. Corzine and J. A. Mosko, Four-arm Spiral Antennas, Norwood, MA, Artech House, 1990.
[2] R. H. DuHamel and J. P. Scherer, "Frequency-independent antennas," Antenna Engineering Handbook, 3rd ed., R. C. Johnson, Ed., McGraw-Hill, New York, 1993, Ch. 14, pp. 53-82.
[3] M. Karlsson and S. Gong, An integrated spiral antenna system for UWB, Proc. IEEE 35th European Microwave Conf., Paris, France, Oct. 2005, pp. 2007-2010.
[4] E. Gschwendtner, D. Löffler and W. Wiesbeck, Spiral antenna with external feeding for planar applications, IEEE Africon, vol. 2, Sep. 1999, pp. 1011-1014.
[5] C. Kinezos and V. Ungvichian, Ultra-wideband circular polarized microstrip archimedean spiral antenna loaded with chip-resistor, IEEE Antennas and Propagation Society International Symposium, vol. 3, Jun. 2003, pp. 612-615.
[6] M. Karlsson and S. Gong, Wideband patch antenna array for multi-band UWB, Proc. IEEE 11th Symp. on Communications and Vehicular Tech., Ghent, Belgium, Nov. 2004.
[7] G. R. Aiello and G. D. Rogerson, Ultra Wideband Wireless Systems, IEEE Microwave Magazine, vol. 4, no. 2, pp. 36-47, Jun. 2003.
[8] S. Gong, M. Karlsson, and A. Serban, Design of a Radio Front End at 5 GHz, Proc. IEEE 6th Circuits and Systems Symp. on Emerging Tech., Shanghai, China, Jun. 2004, vol. 1, pp. 241-244.
[9] D. Guha and J. Y. Siddiqui, Resonant frequency of equilateral triangular microstrip antenna with and without air gap, IEEE Trans. on Antennas and Propagation, vol. 52, no. 8, Aug. 2004.
[10] H. Nakano, Y. Okabe, H. Mimaki and J. Yamauchi, A Monofilar Spiral Antenna Excited Through a Helical Wire, IEEE Trans. on Antennas and Propagation, vol. 51, no. 3, Mar. 2003.
[11] H. Nakano, J. Eto, Y. Okabe and J. Yamauchi, Tilted- and Axial-Beam Formation by a Single-Arm Rectangular Spiral Antenna With Compact Dielectric Substrate and Conducting Plane, IEEE Trans. on Antennas and Propagation, vol. 50, no. 1, Jan. 2002.
[12] A. P. Jenkins, A. M. Street and D. Abbott, Filter design using CAD. II. 2.5-D simulation, Effective Microwave CAD, IEE Colloquium, no. 1997/377, pp. B1/1-B1/5, Dec. 1997.
[13] P. de Maagt, R. Gonzalo, Y. C. Vardaxoglou and J.-M. Baracco, "Electromagnetic bandgap antennas and components for microwave and (sub)millimeter wave applications," IEEE Trans. on Antennas and Propagation, vol. 51, no. 10, pp. 2667-2677, Oct. 2003.
[14] J.-M. Baracco, M. Paquay and P. de Maagt, "An electromagnetic bandgap curl antenna for phased array applications," IEEE Trans. on Antennas and Propagation, vol. 53, no. 1, pp. 173-180, Jan. 2005.
Magnus Karlsson was born in Västervik, Sweden in 1977. He received the M.Sc. and Licentiate of Engineering from Linköping University in Sweden, in 2002 and 2005, respectively.
In 2003 he started his Ph.D. study in the Communication Electronics research group at Linköping University. His main work involves wideband antenna techniques and wireless communications.

Shaofang Gong was born in Shanghai, China, in 1960. He received the B.Sc. degree from Fudan University in Shanghai in 1982, and the Licentiate of Engineering and Ph.D. degrees from Linköping University in Sweden, in 1988 and 1990, respectively.
Between 1991 and 1999 he was a senior researcher at the microelectronic institute Acreo in Sweden. From 2000 to 2001 he was the CTO at a spin-off company from the institute. Since 2002 he has been full professor in communication electronics at Linköping University, Sweden. His main research interest has been communication electronics including RF design, wireless communications and high-speed data transmissions.

Regular Paper
Original Contribution

Performance Evaluation of Analog Systems Simulation Methods for the Analysis of Nonlinear and Chaotic Modules in Communications

J. C. Chedjou and K. Kyamakya
Institute for Smart-Systems Technologies, University of Klagenfurt, Klagenfurt, Austria
e-mails: Jean.Chedjou@uni-klu.ac.at, Kyandoghere.Kyamakya@uni-klu.ac.at

Van Duc Nguyen
Faculty of Electronics and Telecommunications, C9-P403, Hanoi University of Technology
e-mail: van_duc_nguyen@yahoo.com

Ildoko Moussa and J. Kengne
Doctoral School of Electronics, Information Technology, and Experimental Mechanics (UDETIME), University of Dschang, Dschang, Cameroon
e-mail: kengnemozart@yahoo.fr

Abstract
The revolutionary idea of building analog cellular computers based on cellular neural network (CNN) systems to change the way analog signals are processed is proof of the high importance attached to analog simulation methods. This paper provides the basics of the methods that can be exploited for the analog simulation of very complex systems (an implementation on chip using CNN technology is possible, even on FPGA). We evaluate the performance of analog systems simulation methods. These methods are applied to the investigation of the nonlinear and chaotic dynamics in some modules of communication systems. We list some problems encountered when using this approach and propose appropriate techniques to tackle them. The overall motivation is to encourage the research community to use analog methods for systems analysis, despite the strong focus on numerical approaches that has kept analog simulation alternatives somewhat in the dark during the past decades. Both advantages and limitations of the analog modelling schemes are discussed versus those of their numerical counterparts. To illustrate the concepts, a communication module consisting of a shunt type Colpitts oscillator is considered. The electrical structure of the oscillator is addressed and the modeling process is performed to derive the equations of motion. A numerical analysis is carried out to obtain various bifurcation diagrams showing scenarios leading to chaos. Both PSPICE-based simulations and laboratory experimental realizations (with analog circuits) are considered to validate the modeling and to confirm the numerical results. Near-sinusoidal oscillations, sub-harmonics and chaos are observed. The bifurcation study reveals that the system moves from the near-sinusoidal regime to chaos via the usual paths of period doubling and sudden transitions. One of the interests of this work (amongst many others) is to prove that the analog systems simulation approach is more suitable than its numerical counterpart for the analysis of the striking and complex dynamics of nonlinear parts/modules of communication systems.

Keywords: Communication Systems; Shunt Colpitts Oscillator; Bifurcation; Chaos; Analog Systems Simulation Methods

I. INTRODUCTION

The last decade has witnessed tremendous attention to the effects of nonlinearity in sinusoidal oscillators [1-23]. The interest devoted to these effects is explained by the rich and complex behaviour the oscillators can exhibit in their nonlinear states, and also by the various technological and fundamental applications of such oscillators. Indeed, in their nonlinear states these oscillators can be exploited in many applications such as measurement, instrumentation and telecommunications. In their regular states, the oscillators can be used for instrumentation and measurements, while the chaotic behaviour (irregular state) exhibited by the oscillators can be used in chaotic secure communication [24], just to name a few.
Concerning either shunt [1-6] or non-shunt [7, 8] structures
of the Colpitts oscillator, some interesting works have been
carried out. Reference [1] considers an analytical approach
based on the asymptotic method to analyse the dynamics of the
Colpitts oscillator. Bifurcation scenarios are obtained
numerically to confirm the richness of the modes exhibited by
the Colpitts oscillator. The extreme sensitivity of this oscillator
to tiny changes in its parameters is shown. Reference [2] deals
with the observation of chaos in the Colpitts oscillator. A
model (set of equations) describing the autonomous states of
the oscillator is proposed. A piecewise-linear circuit model is
considered. Chaotic behaviour is observed numerically and
experimentally. The authors of reference [3] focussed on the
relationship between the chaotic Colpitts oscillator and Chua's
circuit. They showed that the Colpitts oscillator might be

mapped to Chua's oscillator with an asymmetric
nonlinearity. Some bifurcation structures of the oscillators are
obtained and the structure under consideration exhibits striking
phenomena amongst which period doubling scenarios to chaos
are observed. Reference [4] develops a methodological
approach to the analysis and design of a Colpitts oscillator. A
nonlinear approach for the two quasi sinusoidal and chaotic
operating modes was considered. In particular, the generation
technique of regular and irregular (chaotic) oscillations in
terms of the circuit parameter was shown. Reference [5]
considers non-smooth bifurcations in piecewise-linear model of
the Colpitts oscillator. An approximate 1D map was proposed
for predicting border collision bifurcation (common in power
electronics) of the Colpitts oscillators. In reference [6] one
considers the modelling of chaotic generators using microwave
transistors. The transition from a simplified mathematical
model to a model of an RF- or microwave-band chaotic source is
discussed. Reference [7] exploits a Lur'e system form to clarify
the occurrence of chaos in the Colpitts oscillator. Components
of the autonomous chaotic Colpitts oscillator causing the
variation of equilibrium points are identified. This study was
extended by the same authors to the case of the forced
(non-autonomous) Colpitts oscillator. Both amplitude and frequency
of the external excitation were used for chaos control in the
oscillator. The dynamic maps of locking, transition, and normal
areas, with their related frequencies and output powers, were
depicted by measurements. Simulation and experimental
results of injection-locked behaviour were discussed and
presented.
The works summarised above show the Colpitts oscillator
(either in a shunt or a non-shunt structure) as a chaotic generator. It
appears that the non-shunt structure of the Colpitts oscillator
has been intensively considered, while the literature is very poor
in information related to the shunt structure of such an
oscillator. It has been shown that tiny changes (imperfection or
instability in the shunt structure of the oscillator) in the
parameter values can generate chaotic behaviour of the
oscillator. One of the advantages of the shunt Colpitts oscillator
(amongst many others) lies in its practical realisation.
Indeed, this structure is simple to realise. Moreover, the
good stability of the fundamental characteristics (that is the
amplitude and phase) of the waveforms generated by such a
structure is due to the fact that the biasing current is not
flowing in the oscillatory network as observed in the non-shunt
structure. These are some advantages of the shunt structure of
the Colpitts oscillator.
This paper considers the shunt structure of the Colpitts
oscillator. To the best of our knowledge, the approximate analytical
results available in the literature concerning such a structure are
obtained from analysis tools based on Lur'e system forms.
These models were used to study the stability conditions of
regular oscillations and the possible appearance of chaos in the
shunt structure of the Colpitts oscillator. Nevertheless, the
literature neither proposes a direct modelling of the shunt
structure of the Colpitts oscillator nor analyses the chaoticity
(degree of chaos) of the oscillator from its real model.
Our aim in this paper is to list some difficulties that
can be faced when performing analog experimental simulations
and to propose appropriate methods to tackle them. We also
contribute to the general understanding of the behaviour of the
shunt structure of the Colpitts oscillator and complete the results
obtained so far by a) carrying out a systematic and methodical
analysis of its nonlinear dynamics; b) providing both
theoretical and experimental (analog) tools, which will be of
precious use for design and control engineers, since they can be
used to get full insight into the nonlinear dynamical behaviour of
the oscillator; c) pointing out some of the unknown and striking
behaviours of the shunt Colpitts oscillator.
One of the traditional properties of sinusoidal oscillators is
the possibility of adjusting the frequencies of the generated
waveforms through the RC or LC resonator components.
Nevertheless, such an operation becomes very delicate or even
intractable when the effects of nonlinearity are taken into
account, since a rigorous analysis shows the strong dependence
upon nonlinearity of the fundamental characteristics (that is,
both amplitude and frequency) of the generated waveforms.
This can clearly be demonstrated by performing a rigorous
analysis to express the fundamental characteristics of the
generated waveforms in terms of the parameters of the bipolar
junction transistor responsible for the nonlinearity phenomena
observed in the structural behaviour of such oscillators.
The structure of the paper is as follows. Section 2 presents
the theoretical methods versus analogue methods. An
evaluation of some difficulties currently faced when
performing each of these methods is carried out. Some
appropriate solutions are proposed to tackle these difficulties.
In section 3 we use a simpler model of the bipolar junction
transistor for modelling. The state equations of the circuit
designed corresponding to the shunt structure of the Colpitts
oscillator are obtained. The numerical simulation is carried out
and various bifurcation diagrams, associated with their
corresponding graphs of the largest one-dimensional (1D)
Lyapunov exponent, are obtained, showing both complex and
striking scenarios to chaos. Section 4 exploits the Pspice
software to simulate the dynamics of the shunt structure of the
Colpitts oscillator using the trial and error approach. The
chaotic behaviour of the oscillator is observed. In section 5,
experimental measurements on a real circuit are performed to
confirm the results from both numerical and Pspice simulation
tools. Section 6 deals with conclusions and proposals for
further works.
II. EFFICIENCY OF THE THEORETICAL METHODS
VERSUS THE ANALOG METHODS: EVALUATION
AND SOME PROPOSALS
A. Theoretical methods vs. analog methods
We briefly describe and compare both the analog and the
numerical methods. Invariably the question arises: which is
better, the analog or the numerical method? Our wish is to present
both the limits and the advantages of each of the methods, in order to
keep the answer to this question open. We present
and prescribe some practical advice for dealing with analog
implementation techniques. Our aim is to encourage engineers
to use these techniques for the analysis of nonlinear models
despite some practical difficulties faced (when performing
these techniques) such as saturation and offset phenomena of
the discrete components (diodes, transistors, operational
amplifiers, and multipliers) of the electronic circuits. In
addition to these difficulties is the dependence of the accuracy
of the analog techniques upon the precision and stability of the
electronic components. We hope that, by proposing some
techniques to tackle the difficulties encountered during analog
implementations, we will encourage interested researchers to
use analog systems simulation techniques for the analysis of
nonlinear problems.
Theoretical approaches (analytical and numerical methods)
are commonly used to investigate the dynamics of nonlinear
systems. However, the problems faced when performing
numerical simulation methods are well-known: a) lack of
method to choose the appropriate numerical integration step
size, b) lack of method to determine the duration of the
transient phase of a numerical simulation, and c) numerical
simulation of complex dynamical systems is very time
consuming compared with its analog counterparts, which are
very fast [24]. Though the analog implementation is always
limited by the saturation and offset phenomena of analogue
devices such as operational amplifiers (LM741 and LF351) and
multipliers (AD-633JN), it does, however, offer good ways to
tackle the above difficulties faced by the numerical analysis.
Analytical methods can provide only approximate solutions of
nonlinear dynamical models, while analog methods give exact
solutions. These are some major reasons for the increasing
interest devoted to this type of simulation for the analysis of
nonlinear and chaotic physical systems [24-28]. In fact, a
properly designed circuit can provide sufficiently good real-time
results faster than a numerical simulation on a fast
computer [24]. Such a circuit must use high-precision resistors
and capacitors. In addition, the offset voltage of the operational
amplifiers and multipliers must be well controlled.
The analog techniques do not take into account the
notion of algorithm and there is no need to translate
quantities into appropriate symbolic forms. For these
techniques, variables are represented by physical quantities on
which the operations are performed. The simulation is carried
out by some physical systems that obey the same mathematical
relations that control the physical or technical phenomenon
under investigation [29]. This procedure is in some sense more
natural to both physicists and engineers [30]. A virtue of the
analog techniques is that their basic design concepts are usually
easy to recognize. What goes on inside is understandable since
it is an analog of the real system whereas the numerical type
simulator is a product of pure logic. It cannot be described as
similar to something with which we are familiar [31, 32].
Therefore, although the numerical analysis had long ago
superseded the analog techniques due, to a large extent, to the
spectacular development of digital technology, we still believe

that analog techniques might bring some fresh air to the theory
of computation in the field of nonlinear dynamics.
B. Practical problems and advice
Concerning the effects of nonlinearity in an unspecified
sinusoidal oscillator, we mention that they come mainly from
the discrete components (bipolar junction transistor (BJT),
operational amplifier (Opamp), or analog circuit multipliers)
constituting the gain elements. Thus, the effects of nonlinearity
are perceived differently depending upon the type of gain
element. Some recent scientific contributions [1-4] have
presented analytical approaches to explain the nonlinear
behavior exhibited by classical oscillators using bipolar
junction transistors. The exponential dependence of the current
flowing through the collector of the BJT with respect to the
voltage drop between its base-emitter regions was presented as
the origin of nonlinearity in the oscillators. Bifurcation
structures were presented, showing the coexistence between the
regular states (limit cycles) and the irregular states (chaotic) of
the oscillators.


The difficulties faced by the theoretical (analytical and
numerical) analysis methods are avoided when performing
analog implementation techniques. Nevertheless, various
analog implementation techniques. Nevertheless various
practical problems are currently encountered during the
implementation of analog circuits. Among these problems are
some that automatically induce errors in analog calculations.
The development below lists some practical problems
encountered and proposes solutions to tackle each of them.

Offset phenomenon:

This phenomenon is the presence of a static voltage at
the inputs of analog devices (such as operational amplifiers
(UA741), circuit multipliers (AD633JN), etc.) when they are
biased by a DC voltage source. An offset usually also
occurs at the outputs of the analog devices [39]. We have
demonstrated the cancellation technique for this phenomenon in
Ref. [24]. Indeed, such a phenomenon can be compensated by
using a compensation array that consists of adjusting a
precise potentiometer to reduce the effect of the phenomenon
on the dynamical behavior of the analog devices [24, 39]. The
steps to perform offset cancellation are threefold. We use a
potentiometer (P) having three terminals, amongst which the
middle terminal is movable between the two others, which are
fixed. The first step is to connect the fixed terminals of (P) to pins 1
and 5 of the Opamp (for example, UA741 or LF351). The
second step is the connection of the movable terminal of (P) to a
DC voltage source (the biasing source, for instance). The third step:
by adjusting the potentiometer (P), we measure the evolution of the DC voltage
both at the inputs and at the output of the Opamp. The situation where
the movable terminal is very close to pin 1 or 5
should be avoided, since it can lead to the destruction of the
device (Opamp) due to the simultaneous presence of the entire
bias value at both pins 1 and 5. The potentiometer is
adjusted until the input voltages of the Opamp are of almost
the same magnitude. The offset cancellation
becomes very complex when the electronic circuit is of a self-

sustained type. In this case, the voltages at the inputs of the analog
device can be a direct consequence of the self-sustained
character of the circuit. When the value of the self-sustained
voltage at the inputs of the analog devices can be predicted (by a
circuit-theory analysis, or by taking into account the expected
performance of the circuit, which may be defined before its
realization), the offset cancellation method can be
used to fix the predicted values.

Saturation phenomenon:

The dynamics of analog circuits is limited by the value(s)
of the DC voltage(s) source(s) used for biasing. The saturation
phenomenon occurs when a signal of magnitude greater than
the value(s) of the DC voltage(s) source(s) used for biasing is
found at a given point in the electronic circuit. To overcome
the saturation problem, the scaling factor process is applied by
rescaling the state voltages at different points of the electronic
circuit in order to fit within the biasing range. We use a static
check to verify that the system has been wired correctly. By
tracing through the system, we can calculate what the output
voltage of each component should be. If all measured outputs
are determined to be of the correct magnitude and sign, it can
safely be assumed that the system is wired correctly.
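The scaling-factor process can be illustrated with a few lines of Python (a minimal sketch of ours, assuming ±12 V supply rails and hypothetical predicted state maxima; none of these numbers come from the paper):

    # Amplitude scaling sketch: map each predicted state maximum onto the
    # usable supply range so that no analog-device output saturates.
    # The +/-12 V rails and the example maxima are illustrative assumptions.
    RAIL = 12.0          # biasing voltage, V (assumed)
    HEADROOM = 0.8       # keep outputs at no more than 80% of the rail

    predicted_max = {"V1": 35.0, "V2": 4.2, "V3": 60.0}  # hypothetical maxima, V

    scale = {name: (HEADROOM * RAIL) / vmax for name, vmax in predicted_max.items()}
    for name, k in scale.items():
        print(f"{name}: scale by {k:.3f} -> max {predicted_max[name] * k:.2f} V")

Each scale factor below 1 indicates a state that must be attenuated before wiring, exactly the static check described above.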

Power transfer:

When electric current flows from one electronic
network (the transmitter) to another (the receiver), power is
transferred in the same direction. A problem may arise where
the power is not transferred. This can be explained by the fact
that the dynamic resistance at the output of the transmitter is not
of the same order as that at the input of the receiver. Such a problem
can be solved by matching the total dynamic resistance
(impedance) between the two electronic networks. This is
generally achieved by adding, in parallel, a dynamic resistance
at the output of the first electronic network (the transmitter) or at
the input of the second electronic network (the receiver). The
power may also fail to be transferred because the connection between
the transmitter and the receiver is open. This problem can be
detected by measuring the voltage at each point of the analog
circuit.
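The effect of such a mismatch can be quantified with the standard resistive source-load transfer formula; the sketch below is our own illustration (not the paper's), showing why resistances of different orders of magnitude lose most of the available power:

    def transfer_ratio(r_out, r_in):
        """Fraction of the maximum available power delivered to the receiver
        for a resistive source r_out driving a resistive load r_in."""
        return 4.0 * r_out * r_in / (r_out + r_in) ** 2

    print(transfer_ratio(50.0, 50.0))    # 1.00: matched, full available power
    print(transfer_ratio(50.0, 5000.0))  # ~0.04: two orders of magnitude apart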

Defective electronic devices:

Damage to analog devices is generally caused by wrong
supply (biasing) or by a complete involuntary shunt
between their inputs. When a device is defective, the
temperature within it may be very high; this provides
a usual check for some analog devices such as
Opamps and the AD633JN analog multipliers. Modules for a direct
test of defective components are available. Opamps, for instance,
can be tested as voltage followers.
Analog circuit multipliers can also be tested by loading their
inputs with well-known electrical signals. Nevertheless, a
situation may occur where, despite the fact that the analog
devices (Opamps and analog multipliers) are not defective,
there is no signal at their outputs, or the signals at the outputs are
not those expected. This is a classical problem related to the
bandwidth of the analog devices used, which should be controlled
when performing analog experimentation.

Time scaling:

It is well-known that the state accuracy of electronic
circuits depends on the accuracy of their electronic components
(Opamps, analog multipliers, resistors, capacitors, etc.) [24, 39].
Yet, the dynamics of electronic circuits is limited by the
frequency bandwidth of the analog devices (Opamps,
multipliers, etc.). When an analog device operates within a
range of frequencies not included in its bandwidth, this affects
the behavior of the electronic circuit containing the device and,
consequently, the results obtained are not correct. The time
scaling process offers the analog devices (e.g. Opamps,
circuit multipliers, etc.) the possibility to operate within their
bandwidth. This process is currently used to transpose high
frequencies into low frequencies, and inversely, depending
upon the frequency bandwidth of the analog devices, in order to
ensure their good functioning. The time scaling process is also
of high importance when performing analog simulation. It
offers the possibility to simulate the behavior of the system at
very high frequencies by performing an appropriate time
scaling, which consists of expressing the real time variable t
in terms of the analog simulation time variable τ
(e.g. t = 10^(-a) τ), allowing the simulation frequency to be
10^a times less than the real frequency. Here, a is a positive
integer depending on the values of the resistors and capacitors
used in the analog simulator. One of the advantages of time
scaling, amongst many others, is the possibility it offers to the
integrators to manage both high- and low-frequency signals.
Time scaling also allows the simulation of either high-frequency
or large-broadband phenomena using analog devices (Opamps,
analog multipliers, etc.) that operate in a restricted frequency
bandwidth [39].
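A small numeric example of the time-scaling rule t = 10^(-a) τ, as we reconstruct it from the garbled original (so treat the exact form as an assumption): with a = 3, a 2 MHz phenomenon is simulated at 2 kHz, comfortably inside the bandwidth of ordinary Opamps.

    def simulated_frequency(f_real_hz, a):
        """Time scaling t = 10**(-a) * tau maps a real frequency to a
        simulation frequency 10**a times lower; a is the positive integer
        set by the resistor/capacitor values of the analog simulator."""
        return f_real_hz / 10 ** a

    f_real = 2e6                      # real phenomenon at 2 MHz (illustrative)
    for a in (2, 3, 4):
        print(f"a = {a}: simulate at {simulated_frequency(f_real, a):.0f} Hz")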
III. IMPORTANT CONTRIBUTION

Some interesting proposals were presented to tackle the
problems encountered by numerical simulation. Concerning the
problem of integration discontinuities related to the
choice of the numerical integration step size, Thomas
Rübner-Petersen [33] proposed an efficient algorithm using backward
time-scale differences for solving stiff differential-algebraic
systems. The proposed approach has computational advantages
in simplicity and flexibility with respect to variations of the
integration order. In fact, this algorithm allows the order within
each step to be changed in an optimal way between k and
k + 1 . The implementation of the algorithm is described as
part of a nonlinear analysis program, which has proved to be
quite efficient for simulations of electronic networks. This
program provides parameters in the DC analysis mode to be
varied with automatic control of the step size. We have found
that though the proposed method is very interesting in solving
the numerical convergence problem since it varies
automatically the step size to obtain an appropriate converging
one, it requires very long integration time. Moreover, the

integration duration may become even much larger because it
increases with increasing nonlinearity in the system under
investigation.
The community providing usable technical solutions for
computer-based design (or CAD) has proposed the possibility
of using the GEAR algorithm [34] in Spice to overcome the
divergence problem due to an inappropriate choice of the
integration step size [35]. The proposed method, though quite
interesting, is limited by the fact that a simulation using Spice
is still a theoretical analysis, because the characteristics (or the
internal parameters) of the analog components (diodes,
transistors, operational amplifiers, and multipliers) are
chosen to be ideal (that is, they are not transferable to real components,
which are generally far from being ideal). In addition,
Spice is an emulation, and the calculations it performs are done
through algorithmic processes on a computing platform of the
Von Neumann type.
Note that Pspice and Simulink are calculation tools that
are currently used for analog analysis rather than for a real
physical implementation (see the subsection below). These
simulation tools are purely theoretical and still rely on some
form of numerical computation in the background. Further, the
analog components they use are generally considered in
states where their characteristics are ideal. This would seem to be an
advantage, since the results obtained are of very good accuracy.
Unfortunately, the simulation of complex dynamical systems
using these tools is very time consuming (due to the
numerical computations still running in the background). Nevertheless,
it is clear and sufficiently convincing that analog systems
simulation (either analog simulation of the circuits or the
direct implementation of the circuits) is more suitable than its
other counterparts for the analysis of complex nonlinear
phenomena. It is a very precious tool for reliably detecting
some strange phenomena such as chaos, modulation,
demodulation and also synchronization, to name a few.
IV. SAMPLE RESULTS TO ILLUSTRATE THE CONCEPTS

We consider the shunt type structure of the Colpitts
oscillator. The interest devoted to this oscillator is its ability
to behave chaotically at both low and high
frequencies. The stability of the shunt type oscillator is also of
high importance, since it allows an efficient exploitation of
such a structure in instrumentation, measurement and
telecommunication. Our aim is to propose a shunt type
structure of the Colpitts oscillator of practical interest to enrich
the literature concerning nonlinear oscillators. The proposed
structure might fulfill the above requirements. We show that
the proposed structure can be realized experimentally. We also
show that analog systems simulation (that is, both analog
simulation design and direct implementation with analog
electronic components) is very suitable to get full insight into
the behavior of the oscillator.

A. Circuit description
Fig. 1 shows the design of the shunt structure of the Colpitts
oscillator under investigation. The bipolar junction transistor
Q1, used in the common-base configuration, plays the role of the
nonlinear gain element. The feedback network consists of the
inductor L and the capacitors C1 and C2. These capacitors
act as a voltage divider. C3 is a coupling capacitor, which has
very low impedance within the frequency band in
which the oscillator operates. The biasing is provided by the
DC voltage source VCC. I0 is an ideal current source. The
difference between the series type and the shunt type Colpitts
oscillators is that the biasing current does not flow through the
feedback network in the latter type.

Figure 1. Circuit diagram of the shunt type Colpitts oscillator

The fundamental frequency of a shunt type Colpitts
oscillator can be estimated as follows:

f_0 = \frac{1}{2\pi} \sqrt{\frac{C_1 + C_2}{L C_1 C_2}}    (1)
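As a quick numerical check of Eq. (1) (our own sketch; C1 = 10 nF is an illustrative value inside the bifurcation windows studied below, and the nF/µH units are our reading of the garbled component values quoted later):

    import math

    def colpitts_f0(L, C1, C2):
        """Fundamental frequency of the shunt Colpitts oscillator, Eq. (1)."""
        return math.sqrt((C1 + C2) / (L * C1 * C2)) / (2.0 * math.pi)

    # Component values quoted later in this section; C1 is illustrative.
    L, C1, C2 = 470e-6, 10e-9, 100e-9   # 470 uH, 10 nF, 100 nF
    print(f"f0 = {colpitts_f0(L, C1, C2) / 1e3:.1f} kHz")  # ~77 kHz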

In the structure of Fig. 1, the nonlinear device is the
bipolar junction transistor Q1, whose nonlinear character is
responsible for the chaotic behaviour exhibited by the
electronic circuit. The transistor Q2N3904 is chosen for the
investigations. The choice of this type of transistor is
motivated by the fact that the dynamical input impedance

Z input =

h11
h21 + 1

(2)

is expressed in terms of the hybrid parameters, whose values
give an appropriate dynamical input impedance
that allows a good power transfer. Another advantage of the
Q2N3904 is its availability in the Pspice simulation package;
therefore, the results from Pspice can be compared with
experimental results. Three steps are considered for the
investigation of the dynamical behaviour of the shunt type
Colpitts oscillator: (1) modelling of the oscillator and analysis
of the chaotic behaviour it exhibits; (2) Pspice
simulation of the oscillator using the trial and error

approach; (3) direct implementation of the oscillator. These
steps allow the confirmation or validation of the results
obtained concerning the behaviour of the oscillator.
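Eq. (2) is easy to sanity-check numerically. The hybrid parameters below are illustrative small-signal values of the kind found on BJT datasheets; they are assumptions of ours, not values given in the paper:

    def z_input(h11, h21):
        """Dynamical input impedance of the common-base stage, Eq. (2)."""
        return h11 / (h21 + 1.0)

    h11 = 4.0e3   # input impedance, Ohm (assumed illustrative value)
    h21 = 150.0   # forward current gain (assumed illustrative value)
    print(f"Z_input = {z_input(h11, h21):.1f} Ohm")  # ~26.5 Ohm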
B. Circuit modelling and chaotic behavior

Circuit modeling

The BJT operates in just two regimes, namely either the
forward-conducting or the non-conducting one. Therefore, for
theoretical analysis, a simplified model [1] consisting of one
current-controlled source and a single diode with an exponential
characteristic is convenient. Under these assumptions, the
emitter and collector currents are defined as follows:

I_E = I_S \left[ \exp\!\left( \frac{V_{BE}}{V_T} \right) - 1 \right]    (3a)

I_C = \alpha_F I_E    (3b)

where I_S is the saturation current of the base-emitter junction
and V_T = 26 mV is the value of the thermal voltage at room
temperature. α_F denotes the common-base short-circuit
forward current gain of the BJT.
We emphasize that the idea is to select the simplest
possible model which maintains the essential features [16]
exhibited by the real circuit. If we denote by I_L the current
flowing through the inductor L, and by V_i (i = 1, 2, 3) the
voltages across the capacitors C_i, the Kirchhoff Current Law
(KCL) can be applied to the circuit of Fig. 1 to obtain the
following set of differential equations describing the evolution
of the voltages V_i within the electronic circuit:

dI L
= V1 + V2
dt

(4a)

x=

where

dV2
= VCC V1 V2 V3 +
dt
R[(1 F )I E I L I 0 ]

dV
RC3 3 = VCC V1 V2 V3 R F I E
dt

F = 1 (this is
100 F 200 )

with
dimensionless quantities:

(5b)
(5c)
(5d)

= LC 2 , =

C2
L
, 1 =
,
C1
C2

VCC
IS
C2

=
, =
,
, =
, and
C3
VT
R
VT
I0
=
, Eqs. (4) can be transformed into the following
VT

2 =

first order differential equations:

dx
= y+z
(6a)
d
dy
= 1 .[ x + ( y z ) f ( z )]
(6b)
d
dz
= .( y z ) x
(6c)
d
d
= 2 .[ .( y z ) f ( z )]
(6d)
d
where f ( z ) is the exponential function derived from Eqs. (3)
f ( z ) = ( exp( z ) 1) )

(4b)

(4c)

(6e)

The model described by Eqs. (6) is the one we propose for the
investigation of the dynamical behaviour of the structure
under consideration of the shunt type Colpitts oscillator.

Assuming

(5a)

VT
V
y= 1
VT
V
z= 2
VT
V
= 3
VT

and expressed as follows:

dV
RC1 1 = VCC V1 V2 V3 +
dt
R F I E RI L

RC 2

iL
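For concreteness, the following Python sketch integrates Eqs. (6) with a fourth-order Runge-Kutta scheme, as the authors describe below. The parameter definitions and the signs follow our reconstruction of the garbled equations above, so this is an illustration rather than the authors' code:

    import math

    # Component values quoted in Section IV (units are our reading of the
    # garbled originals: nF, uF, uH, Ohm); C1 = 10 nF is an illustrative point.
    VCC, VT = 12.0, 26e-3
    C1, C2, C3 = 10e-9, 100e-9, 22e-6
    L, R = 470e-6, 318.0
    IS, I0 = 6.734e-15, 5e-3

    # Dimensionless parameters of Eqs. (5), as reconstructed:
    rho = math.sqrt(L / C2)                  # characteristic impedance, Ohm
    beta1, beta2 = C2 / C1, C2 / C3
    alpha, gamma = VCC / VT, rho / R
    eps, eta = rho * IS / VT, rho * I0 / VT

    def f(z):
        """Exponential BJT nonlinearity, Eq. (6e); the argument is clipped
        to keep the sketch numerically safe during transients."""
        return eps * (math.exp(min(-z, 50.0)) - 1.0)

    def rhs(s):
        """Right-hand side of Eqs. (6) as reconstructed above."""
        x, y, z, w = s
        common = gamma * (alpha - y - z - w)
        return (y + z,
                beta1 * (-x + common + f(z)),
                common - x + eta,
                beta2 * (common - f(z)))

    def rk4_step(s, h):
        """One fourth-order Runge-Kutta step of size h."""
        k1 = rhs(s)
        k2 = rhs([si + 0.5 * h * ki for si, ki in zip(s, k1)])
        k3 = rhs([si + 0.5 * h * ki for si, ki in zip(s, k2)])
        k4 = rhs([si + h * ki for si, ki in zip(s, k3)])
        return [si + (h / 6.0) * (a + 2 * b + 2 * c + d)
                for si, a, b, c, d in zip(s, k1, k2, k3, k4)]

    if __name__ == "__main__":
        state, h = [26.4, 27.3, -27.3, 400.0], 0.005  # near the DC operating point
        for _ in range(100_000):                      # step size as quoted below
            state = rk4_step(state, h)
        print("state after transient:", [round(v, 3) for v in state])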

Chaotic behavior

Eqs. (6) are solved numerically to define routes to chaos.
We use the fourth-order Runge-Kutta algorithm [36, 37]. For
the sets of parameters used in this work, the time step is
always Δτ ≤ 0.005 and the calculations are performed using
real variables and constants in extended mode. The integration
time is always T ≥ 10^6. Here, the types of motion are
identified using two indicators. The first indicator is the
bifurcation diagram, the second being the largest one-dimensional (1D)
numerical Lyapunov exponent, denoted by

\lambda_{max} = \lim_{t \to \infty} \frac{\ln[d(t)]}{t}    (7a)

where


d(t) = \sqrt{(\delta x)^2 + (\delta y)^2 + (\delta z)^2 + (\delta w)^2}    (7b)

and is computed from the variational equations obtained by
perturbing the solutions of Eqs. (6) as follows: x → x + δx,
y → y + δy, z → z + δz, and w → w + δw. d(t) is
the distance between neighbouring trajectories [38].
Asymptotically, d(t) ∝ exp(λmax t). Thus, if λmax > 0,

neighbouring trajectories diverge and the state of the oscillator
is chaotic. If λmax < 0, these trajectories converge and the state
of the oscillator is non-chaotic. λmax = 0 corresponds to a torus state
of the oscillator [38].
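Eqs. (7) can be realized numerically with a Benettin-style two-trajectory scheme; the sketch below is ours (it reuses rhs and rk4_step from the earlier snippet rather than the authors' variational code) and renormalizes the separation d(t) at every step while averaging its logarithmic growth:

    import math

    def largest_lyapunov(s0, h=0.005, n_steps=200_000, d0=1e-8):
        """Benettin-style estimate of the largest 1D Lyapunov exponent:
        a fiducial and a perturbed trajectory are advanced together; the
        separation d is renormalized to d0 after every step and the mean
        of ln(d/d0) per unit time approximates lambda_max of Eq. (7a)."""
        s = list(s0)
        p = list(s0)
        p[0] += d0                            # initial perturbation along x
        acc = 0.0
        for _ in range(n_steps):
            s = rk4_step(s, h)
            p = rk4_step(p, h)
            d = math.sqrt(sum((a - b) ** 2 for a, b in zip(s, p)))  # Eq. (7b)
            d = max(d, 1e-300)                # numerical guard
            acc += math.log(d / d0)
            p = [a + (b - a) * (d0 / d) for a, b in zip(s, p)]      # renormalize
        return acc / (n_steps * h)            # > 0: chaos; < 0: regular motion

    # Example (after running the integration sketch above):
    # print(largest_lyapunov([26.4, 27.3, -27.3, 400.0]))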


Setting the values of the components of Fig. 1
(VCC = 12 V, VT = 26 mV, C2 = 100 nF, C3 = 22 µF,
L = 470 µH, IS = 6.734 fA, I0 = 5 mA, and
R = 318 Ω), we analyze the effects of the capacitor C1 (or
the parameter β1) on the behaviour of the oscillator.
Therefore, a scanning process is performed to investigate the
sensitivity of the oscillator to tiny changes in C1 (β1). The
investigations are carried out in the following windows:
5 nF ≤ C1 ≤ 15 nF, and 15 nF ≤ C1 ≤ 25 nF.
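The scanning process can be sketched as follows (our illustration, building on the integration snippet above): for each C1 only β1 changes in the dimensionless model; we integrate past the transient and record the local maxima of x (proportional to iL), which form one column of the bifurcation diagram:

    def bifurcation_column(C1_value, n_transient=50_000, n_record=20_000, h=0.005):
        """Return the local maxima of x for one value of C1; in the
        dimensionless model only beta1 = C2/C1 depends on C1."""
        global beta1
        beta1 = C2 / C1_value
        s = [26.4, 27.3, -27.3, 400.0]        # start near the DC operating point
        for _ in range(n_transient):          # discard the transient phase
            s = rk4_step(s, h)
        maxima, prev, prev2 = [], None, None
        for _ in range(n_record):
            s = rk4_step(s, h)
            if prev2 is not None and prev2 < prev > s[0]:
                maxima.append(prev)           # prev was a local maximum of x
            prev2, prev = prev, s[0]
        return maxima

    # One bifurcation-diagram column per C1 in the 5-15 nF window:
    for C1_value in (5e-9, 7.5e-9, 10e-9, 12.5e-9, 15e-9):
        pts = set(round(m, 2) for m in bifurcation_column(C1_value))
        print(f"C1 = {C1_value * 1e9:4.1f} nF -> {len(pts)} distinct maxima")

A single distinct maximum indicates a period-1 limit cycle; a doubling count signals period-doubling, and a large, non-repeating set of maxima marks a chaotic window.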

Figure 3. a) Bifurcation diagram of the current iL flowing through the
inductor L in terms of the feedback divider capacitor C1, for
15 nF ≤ C1 ≤ 25 nF.

Various tiny windows of chaotic states of the oscillator are
shown alternating with windows of regular motion. The weak
chaoticity (degree of chaos) of the oscillator is shown. This is
clearly demonstrated by the small values of the largest 1D
numerical Lyapunov exponent, which were always less than
0.0185 for 5 nF ≤ C1 ≤ 15 nF.

Figure 2. a) Bifurcation diagram of the current iL flowing through the
inductor L in terms of the feedback divider capacitor C1, for
5 nF ≤ C1 ≤ 15 nF.

Considering the effects of the capacitor C1, it appears
that the structure of the oscillator in Fig. 1 leads to complex
dynamical behaviour, such as torus, multi-periodic, quasi-periodic,
and chaotic states. We observe various routes to
chaos (such as sudden transition, period-adding, period-doubling,
torus breakdown, or quasi-periodic routes) with
several kinds of periodic and multi-periodic windows. Fig. 2
provides some sample results, showing the bifurcation
diagram (C1 (nF), iL (mA)) for 5 nF ≤ C1 ≤ 15 nF.
Period-doubling routes to chaos are shown both with
increasing and decreasing C1. Also shown is a period-2
sudden transition route to chaos.
Fig. 3 shows the bifurcation diagram (C1 (nF), iL (mA)) for
15 nF ≤ C1 ≤ 25 nF. A period-adding scenario to chaos
(period-5 → period-7 → chaos) is shown. Also shown is the
period-4 sudden transition route to chaos that occurs in a tiny
window of C1 found between 17.5 nF and 18 nF. The weak
chaoticity of the oscillator is also shown.
We have drawn in Fig. 4 some phase portraits of the
current iL flowing in the inductor L for sample values of C1.
This figure confirms the period-doubling scenario to chaos
shown by the bifurcation diagrams. The following transition is
clearly shown: period-1 → period-2 → period-4 → period-8 → chaos.
The attractor of period-8 is not shown because of its
high instability, due to the tiny window within which it
co-exists with both the period-4 and the chaotic attractors, situated
respectively to the left and to the right of the value C1 = 9 nF, as
clearly shown in Fig. 2a.


Figure 4. Numerical phase portraits of iL: a) period-1 or limit cycle
(C1 = 4.7 nF), b) period-2 (C1 = 8.0 nF), c) period-4 (C1 = 8.5 nF),
and d) chaos (C1 = 18.0 nF).

Different routes to chaos observed in the shunt structure


of the Colpitts oscillator are commonly observed in nonlinear
systems, such as forced systems, coupled autonomous
systems, and coupled forced systems [38], to name a few. This
serves to justify the richness of the bifurcations in the shunt
Colpitts oscillator and also the striking phenomena exhibited
by such oscillators.
The model proposed for the shunt Colpitts oscillator has
been computed numerically to get full insight into the
behaviour of the oscillator. The simulation in Pspice is
performed to verify the numerical results obtained and also to
validate the proposed model for the shunt type Colpitts
oscillator.
C. Pspice simulation of the oscillator
We use the same model of the bipolar junction transistor
defined in the preceding section, namely the Q2N3904. The
circuit of Fig. 1 is implemented in Pspice. Here, the trial and
error approach [12-13] is substantially exploited. The
following values of the circuit components are defined in
Pspice to obtain some phase portraits showing the evolution of
the typical phase-space trajectories of the current iL flowing
through the inductor: VCC = 12 V, VT = 26 mV,
C2 = 100 nF, C3 = 22 µF, L = 470 µH, βF = 210,
I0 = 5 mA, and R = 600 Ω. The numerical phase portraits

Figure 5. Phase portraits of iL in Pspice: a) period-1 or limit cycle
(C1 = 4.7 nF), b) period-2 (C1 = 22.0 nF), c) period-4 (C1 = 25 nF),
and d) chaos (C1 = 47 nF).

shown in Fig. 5 are obtained from the Pspice simulation.
These phase portraits are qualitatively similar to those
obtained numerically. Moreover, the sequence of bifurcations
shown numerically (period-1 → period-2 → period-4 → period-8 → chaos)
is confirmed by the Pspice simulation. The results
from the Pspice simulation were generally in very good agreement
with those from the numerical analysis, despite the divergence
observed in the values of the bifurcation points (values of C1).
This divergence can be explained by the (real) characteristics
of the bipolar transistor used in the Pspice simulation. It is
well-known that the characteristics of the transistor used (that
is, the Q2N3904) are predefined and stored in the Pspice
simulation package, some of them being considered as ideal.
Moreover, in order to understand the operation mode of the
BJT in the oscillator, further simulations were performed
using two models of the BJT: a) the Ebers-Moll model and
b) the even simpler transistor model consisting of a simple
diode and a current-controlled source. The results obtained in
both cases were similar to those previously obtained using
Pspice's own model for the BJT. Thus the simpler model is
adequate for investigating the essential behaviour of the
system. This makes it possible later to adopt relatively simple
state equations to describe the oscillator. The divergence
between the numerical and Pspice simulation results was
explained by the non-real characteristics of the BJT used. This
justifies the interest devoted to the real physical
implementation of the shunt Colpitts oscillator, since this
method uses real electronic components and consequently the
characteristics of the electronic components are real.
D. Real physical implementation of the oscillator
According to the previous results, the shunt type Colpitts
oscillator can exhibit complex and striking bifurcation
scenarios leading to chaos, when the feedback divider
capacitor C1 is monitored. The study here is focussed on both
design and analogue experimentation of the shunt type
Colpitts oscillator. The experimental results obtained from a
real implementation of the oscillator are compared with the
results obtained by both the numerical and Pspice simulation
methods.

Figure 6. Experimental setup for measurements on the Shunt Colpitts
oscillator.

Figure 6 shows the proposed experimental setup for
measurements on the shunt type Colpitts oscillator. The
circuit is built on a breadboard. Fig. 6 shows the basic scheme
of the shunt type Colpitts oscillator of Fig. 1 with the
following values of the circuit components: R = 600 Ω,
C2 = 100 nF, C3 = 22 µF, L = 470 µH, and
VCC = 12 V. The network consisting of the operational
amplifier U1 with related resistors is an implementation of the
ideal current generator. If the following condition is fulfilled:

\frac{R_1}{R_2} = \frac{R_3}{R_4 + R_5}    (8a)

the current I_0 pulled from the load is given by:

I_0 = \frac{R_2}{R_1 R_5} V_i    (8b)

where V_i is the output voltage of the network built around the
operational amplifier U2 with related resistors, whose electronic
function is an inverting amplifier. Therefore, with the values
of the components in Fig. 6, the relationship between the
control voltage V_i and the current I_0 is:

I_0 = \frac{V_i}{1000}    (8c)

Thus, I_0 is supposed to vary between 0 and 12 mA as
R_10 varies between 0 and 100 kΩ, since the inverting input of
U2 is connected to -12 V.
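Eqs. (8a)-(8c) can be checked with a few lines of Python. The individual resistor values are not all legible in our copy, so the ones below are illustrative assumptions chosen to satisfy the balance condition of Eq. (8a) and to reproduce the 1/1000 A/V transconductance of Eq. (8c):

    def current_source_i0(Vi, R1, R2, R5):
        """Load current of the op-amp current source, Eq. (8b):
        I0 = (R2 / (R1 * R5)) * Vi, valid when R1/R2 = R3/(R4 + R5)."""
        return R2 / (R1 * R5) * Vi

    # Assumed values: R1 = R2 = 10 kOhm and R5 = 1 kOhm give I0 = Vi / 1000,
    # matching Eq. (8c) and the quoted 0-12 mA range for Vi up to 12 V.
    for Vi in (0.0, 5.0, 12.0):
        i0 = current_source_i0(Vi, 10e3, 10e3, 1e3)
        print(f"Vi = {Vi:4.1f} V -> I0 = {i0 * 1e3:4.1f} mA")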

Figure 7. Experimental phase portraits of iL: a) period-1 or limit cycle,
C1 = 4.7 nF (X: 5 mA/div, Y: 1 V/div); b) period-2, C1 = 22 nF
(X: 5 mA/div, Y: 2 V/div); c) period-4, C1 = 25.3 nF
(X: 5 mA/div, Y: 2 V/div); and d) chaos, C1 = 47 nF
(X: 5 mA/div, Y: 1 V/div).

In order to investigate how the feedback ratio affects the
dynamics of the circuit, C1 is chosen as a control parameter.
The variation of C1 is performed by connecting standard
capacitor components in parallel to obtain the desired value.
The value of the biasing current I0 is set to 5 mA (as in the
above Pspice simulations) using the resistor R10. A 1 Ω
resistor is added in series with the inductor L to sense its
current iL. The experimental results are obtained by observing
as a function of time the voltage across the inductor and by
plotting phase-space trajectories ( i L , v L ) using the
oscilloscope in the XY mode.
As in the case of the Pspice simulations, the dynamical
behaviour of the oscillator changes substantially as C1 is
monitored. This is clearly demonstrated by the experimental
pictures in Fig. 7, showing the real behaviour of the shunt type
Colpitts oscillator proposed in this paper. As it appears in Fig.
7, the real circuit shows the same bifurcation scenarios as
observed using both the numerical and Pspice simulation
methods. Figure 7 shows an evolution of iL starting from
normal near-sinusoidal oscillations to chaos via a period-doubling
sequence when C1 is increased. This evolution
shows bifurcation scenarios (period-1 → period-2 → period-4 →
period-8 → chaos) identical to those from the preceding
simulation methods. Note that the pictures in Fig. 7 are very close
to the numerical phase portraits. This can be considered to
validate the proposed model (Eqs. 6) for the investigation of
the dynamical behaviour of the shunt type Colpitts oscillator.
During our experimental investigations, we have also found

period-adding and sudden transition scenarios to chaos
exhibited by the system. These scenarios were also reported
using both the numerical and Pspice simulation methods. The
experimental results were generally very close to those obtained
from these methods. A very good agreement is obtained when
comparing the experimental values of the bifurcation
parameter C1 with the values from Pspice simulation.

V. CONCLUSION

This paper was motivated by the wish to encourage
engineers to deal with analogue simulation. Nowadays, the
revival of this method is encouraged by the technological
exploitation of analogue systems simulation in various fields,
namely telecommunication, biocomputing, traffic
management, and electronics (instrumentation and measurements),
to name a few. The advantages and limits of analogue
simulation were discussed and compared with those of its
numerical counterpart. Some proposals to tackle some
problems faced during the experimental realisation were
presented. The concepts were illustrated by proposing a
structure of the shunt type Colpitts oscillator. The choice of
this type of oscillator was motivated by our wish to enrich the
literature by showing the capability of the proposed oscillator
to exhibit very complex and striking phenomena. Three
methods were considered during our investigations: the
numerical, the Pspice simulations, and finally the real physical
implementation of the proposed oscillator. These methods
were compared to validate the results obtained. The KCL
theorem was used to derive a model describing the dynamical
behaviour of the oscillator. Taking the feedback divider
capacitor C1 as the control parameter, bifurcation diagrams
associated with their corresponding graphs of the largest 1D
numerical Lyapunov exponent were plotted to summarise the
scenarios leading to chaos. The studies revealed that the
proposed configuration of the Colpitts oscillator can exhibit
near-sinusoidal, quasi-periodic, multi-periodic,
and chaotic oscillations. Very complex bifurcation structures
were obtained: torus, period-adding, period-doubling, and
sudden transition scenarios to chaos. The results from the different
methods were compared and a very good agreement was
observed.
An interesting question under investigation is that of finding
the relationship between the loop gain and the dynamics of the
oscillator. Another problem under consideration is that of
coupling two identical chaotic oscillators of this type and
searching for the synchronization threshold. Such an
investigation is of high importance in many application areas
such as chaotic secure communications where chaos
synchronisation is being exploited in wave coding processes.
It is also of particular interest to consider the implementation
of analog methods exploiting the CNN technology for the
analog simulation of very complex systems especially on
VLSI chip implementations (for example on FPGA). This is
particularly necessary when the number of analog nodes

(needed for simulating a given very complex system) is very
high (many orders of magnitude).
ACKNOWLEDGMENT
J. C. Chedjou would like to acknowledge the financial
support from the Swedish International Development
Cooperation Agency (SIDA) through the International Centre
for Theoretical Physics (ICTP), Trieste, Italy. Further, he
expresses his profound gratitude to the Institute for Smart-Systems
Technologies (IST), Faculty of Engineering, University
of Klagenfurt, Austria.

REFERENCES
[1] G. A. Kriegsmann, "Bifurcation in classical bipolar transistor circuits," SIAM (Studies in Applied Math), Vol. 49, pp. 390-403, 1989.
[2] M. Kennedy, "Chaos in the Colpitts oscillator," IEEE Transactions on Circuits and Systems-I, Vol. 41, pp. 771-774, 1994.
[3] M. P. Kennedy, "On the relationship between chaotic Colpitts oscillator and Chua's circuit," IEEE Transactions on Circuits and Systems, Vol. 42, pp. 373-376, 1995.
[4] G. M. Maggio, O. De Feo and M. P. Kennedy, "Nonlinear analysis of the Colpitts oscillator and applications to design," IEEE Transactions on Circuits and Systems, Vol. 46, pp. 1118-1130, 1999.
[5] G. M. Maggio, M. di Bernardo, and M. P. Kennedy, "Nonsmooth bifurcations in a piecewise-linear model of the Colpitts oscillator," IEEE Trans. Circuits Syst., Vol. 47, pp. 1160-1177, 2000.
[6] A. S. Dimitriev, E. V. Efremova and A. D. Khilinsky, "Modeling microwave transistor chaos generators."
[7] J. C. Liu, H. C. Chou, and J. H. Chou, "Clarifying the chaotic phenomenon in an oscillator by Lur'e system form," Microwave and Optical Technology Letters, Vol. 22, pp. 323-328, 1999.
[8] J. C. Liu, H. C. Chou, and J. H. Chou, "Non-autonomous chaotic analysis of the Colpitts oscillator with Lur'e systems," Microwave and Optical Technology Letters, Vol. 36, pp. 175-181, 2003.
[9] M. Kennedy, "Chaos in the Colpitts oscillator," IEEE Transactions on Circuits and Systems-I, Vol. 41, pp. 771-774, 1994.
[10] M. P. Kennedy, "On the relationship between chaotic Colpitts oscillator and Chua's circuit," IEEE Transactions on Circuits and Systems, Vol. 42, pp. 373-376, 1995.
[11] G. M. Maggio, C. Kennedy, and M. P. Kennedy, "Experimental manifestations of chaos in the Colpitts oscillator," in Proc. ISSC'97, Derry, Ireland, June 1997, pp. 235-242.
[12] J. Zhang, "Investigation of chaos and nonlinear dynamical behaviour in two different self-driven oscillators," Ph.D. thesis, University of London, 2001.
[13] J. Zhang, X. Chen and A. Davis, "High frequency chaotic oscillations in a transformer-coupled oscillator," Proc. of NDES'99, Ronne, Denmark, pp. 213-216, 1999.
[14] A. A. Andronov, A. A. Vitt, and S. E. Khaikin, Theory of Oscillations. New York: Pergamon, 1996.
[15] C. Wegener and M. Kennedy, "RF chaotic Colpitts oscillator," Proc. of NDES'95, Dublin, Ireland, pp. 255-258, 1995.
[16] Y. Hosokawa, Y. Nishio and A. Ushida, "RC-transistor chaotic circuit using phase-shift oscillators," Proc. Int. Symp. on Nonlinear Theory and its Applications (NOLTA'98), Vol. 2, pp. 603-606, 1998.
[17] N. F. Rulkov and A. R. Volkovskii, "Generation of broad-band chaos using blocking oscillator," IEEE Transactions on Circuits and Systems-I, Vol. 48, No. 6, 2001.
[18] A. Namajunas and A. Tamasevicius, "Modified Wien-bridge oscillator for chaos," Electronics Letters, Vol. 31, pp. 335-336, 1995.
[19] D. C. Hamill, "Learning about chaotic circuits with SPICE," IEEE Transactions on Circuits and Systems, Vol. 36, No. 1, February 1993.
[20] P. Antognetti and G. Kuznetzov, Semiconductor Device Modeling with SPICE, 2nd edition. New York: McGraw-Hill, 1993.
[21] L. O. Chua, C. W. Wu, A. Huang, and G. Q. Zhong, "A universal circuit for studying and generating chaos - Part I: Routes to chaos," IEEE Transactions on Circuits and Systems-I, Vol. 40, No. 10, pp. 731-744, 1993.
[22] P. Kvarda, "Identifying the deterministic chaos by using the Lyapunov exponents," Radioengineering, Vol. 10, No. 2, July 2001.
[23] Yu. V. Andreyev, A. S. Dmitriev, E. V. Efremova, A. D. Khilinsky and L. V. Kuzmin, "Qualitative theory of dynamical systems, chaos and contemporary wireless communications," International Journal of Bifurcation and Chaos, Vol. 15, No. 11, pp. 3639-3651, 2005.
[24] J. C. Chedjou, H. B. Fotsin, P. Woafo and S. Domngang, IEEE Trans. Circuits Syst. I, Vol. 48, pp. 748-757, 2001.
[25] T. Zhou and F. Moss, Phys. Rev. A, Vol. 45, pp. 5393-5400, 1992.
[26] A. Azzouz, R. Duhr and M. Hasler, IEEE Trans. Circuits Syst., Vol. 30, pp. 913-914, 1983.
[27] T. S. Parker and L. O. Chua, Proc. IEEE, Vol. 75, pp. 982-1008, 1987.
[28] I. Johnson, Analog Computer Techniques. New York: McGraw-Hill, 1963.
[29] S. Puchta, IEEE Annals of the History of Computing, Vol. 18, pp. 49-59, 1996.
[30] W. S. McCulloch and W. Pitts, Bulletin of Mathematical Biophysics, Vol. 5, pp. 115-133, 1943.
[31] General Electric Management Consultant Services Division, The Next Step in Management and Appraisal of Cybernetics, 1952.
[32] L. Owens, IEEE Annals of the History of Computing, Vol. 18, pp. 34-41, 1996.
[33] T. Rübner-Petersen, "An Efficient Algorithm Using Backward Time-Scaled Differences for Solving Stiff Differential-Algebraic Systems," Report 16/5-73, Institute of Circuit Theory and Telecommunication, Technical University of Denmark.
[34] C. W. Gear, "Simultaneous numerical solution of differential equations," IEEE Trans. Circuit Theory, Vol. CT-18, No. 1, pp. 89-94, January 1972.
[35] J. Pierce, "The Advantages of a Front-to-Back Flow for Windows-Based PCB Design," Cadence Design Systems, Inc., pp. 1-6, 2002.
[36] J. S. Vandergraft, Introduction to Numerical Computation. New York: Academic Press, 1978.
[37] J. C. Chedjou, K. Kyamakya, I. Moussa, H.-P. Kuchenbecker and W. Mathis, "Behavior of a Self-Sustained Electromechanical Transducer and Routes to Chaos," Journal of Vibration and Acoustics, Transactions of the ASME, Vol. 128, pp. 282-293, 2006.
[38] J. C. Chedjou, L. K. Kana, I. Moussa, K. Kyamakya, and A. Laurent, "Dynamics of a Quasi-periodically Forced Rayleigh Oscillator," Journal of Dynamic Systems, Measurement and Control, Transactions of the ASME, Vol. 128, pp. 600-608, 2006.
[39] J. C. Chedjou, K. Kyamakya, W. Mathis, I. Moussa, A. Fomethe and A. V. Fono, "Chaotic Synchronization in Ultra Wide Band Communication and Positioning Systems," Journal of Vibration and Acoustics, Transactions of the ASME (in press, to appear in 2007).


Jean Chamberlain Chedjou received his doctorate in Electrical
Engineering in 2004 from the Leibniz University of Hannover,
Germany. He has been a DAAD (Germany) scholar and also an
AUF research fellow (postdoc). From 2000 to date he has been a
Junior Associate researcher in the Condensed Matter section of
the ICTP (Abdus Salam International Centre for Theoretical
Physics), Trieste, Italy. Currently, he is a senior researcher at the
Institute for Smart Systems Technologies of the Alpen-Adria
University of Klagenfurt in Austria. His research interests include
electronic circuits engineering, chaos theory, analog systems
simulation, cellular neural networks, nonlinear dynamics,
synchronization and related applications in engineering. He has
authored and co-authored 2 books and more than 22 journal and
conference papers.


Moussa Ildoko holds an M.Sc. in control and signal processing
and a Doctorate degree in Electronics from the University of
Valenciennes et du Hainaut-Cambrésis in France, obtained in
1982 and 1985, respectively. He is currently an Associate
researcher at UDETIME (Doctoral School of Electronics,
Information Technology, and Experimental Mechanics) at the
University of Dschang, Cameroon. Besides, he is a Senior
Lecturer at the University of Yaoundé 1, Cameroon. His research
interests are related to nonlinear dynamics, analog circuit design,
and chaos-based secure communications.

Kyandoghere Kyamakya obtained the M.S. in Electrical
Engineering in 1990 at the University of Kinshasa. In 1999 he
received his Doctorate in Electrical Engineering at the University
of Hagen in Germany. He then worked three years as a
post-doctoral researcher at the Leibniz University of Hannover in
the field of mobility management in wireless networks. From
2002 to 2005 he was junior professor for Positioning Location
Based Services at the Leibniz University of Hannover. Since 2005
he has been full Professor for Transportation Informatics and
Director of the Institute for Smart Systems Technologies at the
University of Klagenfurt in Austria.

Van Duc Nguyen received the Bachelor and Master of
Engineering degrees in Electronics and Communications from
the Hanoi University of Technology, Vietnam, in 1995 and 1997,
respectively, and the Doctorate degree in Communications
Engineering from the University of Hannover, Germany, in 2003.
From 1995 to 1998, he worked for the Technical University of
Hanoi as an Assistant Researcher. In 1996, he participated in the
student exchange program between the Technical University of
Hanoi and the Munich University of Applied Sciences for one
term. From 1998 to 2003, he was with the Institute of
Communications Engineering, University of Hannover, first as a
DAAD scholarship holder and then as a member of the scientific
staff. From 2003 to 2004, he was employed by Agder University
College in Grimstad, Norway, as a Postdoctoral Researcher. He
was then with the International University of Bremen as a
Postdoctoral Fellow. In 2007, he spent two months at
Sungkyunkwan University, Korea, as a Research Professor. His
current research interests include mobile radio communications,
especially MIMO-OFDM systems, radio resource management,
and channel coding for wireless networks.

Kengne Jacques obtained the diploma of Technical High School
Teacher (DIPET II) from the Department of Electrical
Engineering (ENSET, University of Douala, Cameroon) in 1995,
and the Master of Science (M.Sc.) degree from the Faculty of
Sciences, University of Dschang, in 2007, both in Electronics.
From 1995 up to now he has been working as a technical high
school teacher. He is currently an associate researcher at
UDETIME (Doctoral School of Electronics, Information
Technology, and Experimental Mechanics) at the University of
Dschang. Mr. Kengne is a doctorate student in Electrical
Engineering in the field of nonlinear dynamics and its
applications in communications.

Regular Paper
Original Contribution
A Frequency-Triplexed RF Front-End for Ultra-Wideband Systems 3.1-4.8 GHz
Adriana Serban, Magnus Karlsson, and Shaofang Gong, Member, IEEE

Abstract: A multi-band and ultra-wideband (UWB) 3.1-4.8
GHz receiver front-end consisting of a fully integrated filter and
triplexer network, and a flat gain low-noise amplifier (LNA) is
presented in this paper. The front-end utilizes a microstrip
network and three combined broadside- and edge-coupled
bandpass filters to connect the three sub-bands. The LNA design
employs dual-section input and output microstrip matching
networks for wideband operation with a flat power gain and a
low noise figure. The system is fully integrated in a four-metal-layer printed circuit board. The measured power gain is 10 dB
and the noise figure of the front-end is 6 dB at each center
frequency of the three sub-bands. The minimum isolation
between the sub-bands is -27 dB and the isolation between the
non-neighboring alternate sub-bands is -52 dB. The out-of-band
interferer attenuation is below -30 dB.
Index Terms: Bandpass filter, broadside coupled, edge
coupled, frequency multiplexing, low-noise amplifier, matching
network, triplexer, multi-band OFDM system, ultra-wideband,
UWB.

I. INTRODUCTION

The ultra-wideband (UWB) technology for short-range
communication applications in the 3.1-10.6 GHz range
has been a target for intensive research in recent years [1]-[6].
The general interest in UWB technology from academia and
industry started in 2002, when unlicensed UWB operation
was permitted by the Federal Communications Commission
(FCC) [1]. The FCC specifications included the spectral mask,
and the bandwidth limitations of a UWB device, but not the
type of modulation scheme or signal. As a result, to exploit the
7.5 GHz of spectrum different approaches have been proposed
[2]-[3]. Currently, there are two dominating and technically
very different versions to the UWB technology. One
approach, known as WiMedia UWB, [4]-[5] is based on the
multi-band orthogonal frequency-division multiplexing
(OFDM) modulation technique. The other one is a single-band
impulse-based or Direct Sequence UWB (DS-UWB) radio, as
described in [2]-[3], [6].
The multi-band OFDM specification divides the frequency spectrum into 500 MHz sub-bands (528 MHz including guard carriers and 480 MHz without guard carriers). The first three sub-bands, known as Band Group 1, cover the spectrum from 3.1 to 4.8 GHz and are centered at 3.432, 3.960, and 4.488 GHz, respectively.

Manuscript received Nov. 5, 2007. Ericsson AB in Sweden is acknowledged for financial support of this work. Adriana Serban (adrse@itn.liu.se), Magnus Karlsson (magka@itn.liu.se), and Shaofang Gong are with Linköping University, Sweden.
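The sub-band centers follow from the MB-OFDM band-plan arithmetic of ECMA-368 (cited as [8] below). The short sketch that follows is not part of the original paper; it merely reproduces the three Band Group 1 center frequencies quoted above from the standard 2904 + 528·n MHz channelization.

```python
# Sketch (not from the paper): MB-OFDM band-plan arithmetic per the
# ECMA-368 channelization, f_c = 2904 + 528*n MHz for band number n.
def band_center_mhz(n: int) -> int:
    """Center frequency of MB-OFDM band number n (1..14) in MHz."""
    return 2904 + 528 * n

for n in (1, 2, 3):  # Band Group 1
    print(f"sub-band #{n}: {band_center_mhz(n) / 1000:.3f} GHz")
# prints 3.432, 3.960 and 4.488 GHz, matching the centers above
```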
Due to the characteristics of UWB signals, i.e., very low radiated power (-41.3 dBm/MHz) and large bandwidth (the minimum bandwidth is 500 MHz), a multi-band OFDM UWB receiver requires better receiver sensitivity and a lower noise figure than, for example, an IEEE 802.11a receiver [7]. The expected receiver sensitivity is around -70 dBm [8], and it can be achieved by an optimal design of the low-noise amplifier (LNA) in terms of wideband operation, a near-to-minimum noise figure and a reasonable power gain. Furthermore, an optimal integration of the entire RF front-end including the antenna can also contribute to better receiver sensitivity by minimizing losses. Another challenge of the UWB front-end design is caused by the problem of narrowband interferers, e.g., out-of-band, in-band, or other unintentional radiation of electronic devices [9]. In particular, services around 2.4 and 5 GHz (IEEE 802.15.1, IEEE 802.11a) can hinder the UWB communication and must be taken into consideration in the UWB front-end implementation. One solution is to filter interference at radio frequencies (RF) before or within the first amplification stage, e.g., the LNA [9]. Two different techniques can be employed to achieve selective operation in different band groups or, more restrictively, within each frequency band group, i.e., in different sub-bands. In [10], LNAs with selective gain-frequency characteristics employ multi-resonance load networks which shape the LNA transfer function. These techniques require area-consuming and complex LC (inductor and capacitor) load networks. Sometimes, they also need load center frequency control mechanisms by means of noisy switching pulses.

An alternative approach with a multi-band LNA covering the multi-band UWB spectrum is presented in this paper. It is a frequency-triplexed RF front-end using one 3.1-4.8 GHz LNA for Band Group 1 UWB systems. The proposed solution combines a multi-band pre-selecting filter function with a frequency multiplexing function to connect the three different RF inputs to only one LNA. The LNA is optimized for a near-to-minimum noise figure and a flat gain response. The RF front-end is completely integrated into a four-metal-layer printed circuit board and is dedicated to a complete integration of the UWB antenna-LNA system on the same RF module.


II. OVERVIEW OF THE UWB FRONT-END


The proposed front-end shown in Fig. 1 consists of three RF inputs for connecting antennas, a frequency multiplexing network (FMN) [11]-[12], and a 3.1-4.8 GHz flat-gain LNA. A selective multi-band operation is automatically achieved within the FMN block. The antenna system and the triplexer have also been studied, and the results are presented in [11]-[12].

Fig. 1. Block diagram of the proposed multi-band UWB front-end: triplexer (FMN), matching networks, and the UWB LNA.
A. Triplexer network

The frequency multiplexer, i.e., the triplexer in this case, is used between the three RF inputs and the UWB LNA to simultaneously filter the potential out-of-band and in-band interferers and to perform the multiplexing. Fig. 2 shows the schematic of the proposed triplexer network realized with microstrip technology. The triplexer consists of three bandpass filters, three transmission lines for filter tuning, and three series quarter-wavelength (λ/4) transmission lines.

The bandpass filters for multi-band UWB applications require 500 MHz bandwidth at each center frequency of 3.432, 3.960 and 4.488 GHz. They are implemented as fifth-order broadside- and edge-coupled filters. The filter tuning lines optimize the stop-band impedance of each filter to provide a high stop-band impedance in the neighboring bands. The three series λ/4 transmission lines provide a high impedance at the respective frequency band.
Fig. 2. Principle of the triplexer: three bandpass filters (BPF) for sub-bands #1-#3 with stop-band tuning transmission lines, connected through series λ/4 lines (at 3.432, 3.960 and 4.488 GHz) to the triplexer output and the LNA.

Fig. 3 shows the principle of the broadside- and edge-coupling techniques and the filter structure used in the UWB front-end. The start and the stop segments are placed on metal layer 1, while the rest of the filter is placed on metal layer 2.

Fig. 3. Filter structure: combined broadside- and edge-coupled filter (broadside coupling between metal layers 1-2, edge coupling within a layer).
The λ/4 network and the bandpass filters are optimized simultaneously for uniform passband performance within the sub-bands. Furthermore, since the sub-bands are so close to each other, a sharp bandpass transfer function was prioritized over low reflection for optimal filtering of potential interference at radio frequencies.
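As a rough illustration of the dimensions involved, the sketch below (not from the paper) estimates the physical length of the series λ/4 lines; the effective dielectric constant eps_eff = 2.7 is a hypothetical value for a 50-ohm microstrip on this kind of substrate, not a figure quoted by the authors.

```python
# Sketch (illustrative only): quarter of the guided wavelength for the
# three series lines, assuming a hypothetical eps_eff = 2.7.
from math import sqrt

C0 = 299_792_458.0  # speed of light in vacuum, m/s

def quarter_wave_mm(f_hz: float, eps_eff: float) -> float:
    """Quarter guided wavelength in millimetres."""
    return C0 / (4.0 * f_hz * sqrt(eps_eff)) * 1e3

for f_ghz in (3.432, 3.960, 4.488):
    print(f"lambda/4 at {f_ghz} GHz: {quarter_wave_mm(f_ghz * 1e9, 2.7):.1f} mm")
```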
B. UWB LNA

Optimally, wideband LNA design methodologies should provide improved receiver sensitivity and thus accurate low-level signal processing. The UWB LNA design handles trade-offs among LNA topology selection, wideband matching for a near-to-minimum noise figure, flat power gain, and wideband bias network design [13]. In addition, as any loss that occurs before the LNA in the system will substantially degrade the noise figure of the front-end, the LNA and the antenna system should be designed simultaneously and preferably integrated on the same substrate.
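The penalty for pre-LNA loss follows from the standard Friis cascade formula, F_total = F1 + (F2 - 1)/G1: a matched passive loss of L dB ahead of the LNA adds L dB directly to the cascade noise figure. The sketch below, with illustrative numbers rather than measured values, makes this explicit.

```python
# Sketch (illustrative numbers, not measured values): Friis cascade of a
# passive loss followed by an LNA. For a matched passive attenuator,
# F = L and G = 1/L, so NF_total(dB) = loss(dB) + NF_LNA(dB).
from math import log10

def db_to_lin(x_db: float) -> float:
    return 10.0 ** (x_db / 10.0)

loss_db = 3.0              # assumed loss ahead of the LNA
lna_nf_db = 3.0            # assumed LNA noise figure

f1 = db_to_lin(loss_db)    # noise factor of the loss
g1 = 1.0 / f1              # gain of the loss (< 1)
f2 = db_to_lin(lna_nf_db)  # noise factor of the LNA
f_total = f1 + (f2 - 1.0) / g1
print(f"cascade NF = {10.0 * log10(f_total):.1f} dB")  # 6.0 dB here
```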
The presented UWB LNA is designed for a noise figure below 4 dB and a flat transfer function over the 3.1-4.8 GHz bandwidth. The wideband amplifier topology relies on reactive matching networks with a low nodal quality factor [14]. Other classical broadband amplifier topologies, such as amplifiers using negative feedback or distributed amplifiers, result in an increased noise figure and/or increased consumption of power and area. By using matching networks implemented with microstrip lines, the large variation of noise figure and power gain due to tolerances of discrete components can be avoided. The area occupied by the distributed matching networks is in this case not critical, as the front-end module area is dominated by the triplexer area. The simplified schematic of the UWB LNA is presented in Fig. 4.
Fig. 4. Microstrip UWB LNA, simplified schematic: input and output matching networks with microstrip stubs, the active device represented as a data file (S2P) component, and a stabilizing resistor Rstab.


The active device (MAX2649) of the amplifier is represented as a two-port network (*.s2p) containing the measured noise and S-parameters provided by Maxim Inc.
III. UWB FRONT-END MANUFACTURING AND EVALUATION

A. UWB front-end manufacturing

The manufactured front-end includes a triplexer network and a UWB LNA. The triplexer photograph is shown in Fig. 5a. Three SMA connectors mounted from the side, Ports 1-3, connect the three sub-band RF inputs. Port 4 is soldered at the output of the front-end. The photograph of the LNA, including the broadband bias network using a butterfly radial stub [13], is presented in Fig. 5b. The UWB LNA is integrated on the back side of the four-layer board. The LNA input is connected to the triplexer output by a via hole.

The prototype has a size of 90 x 58 mm, but note that the actual design only partially fills the printed circuit board. In addition, separate triplexer and UWB LNA modules were fabricated.
Fig. 5. Frequency-triplexed UWB front-end: (a) photo of the triplexer implementation (front side, with Ports 1-4), and (b) photo of the UWB LNA with its broadband bias network (back side).

The three-dimensional (3-D) wideband integration using a four-layer PCB is a challenging task, since the electrical performance can be degraded by parasitic and radiation losses. To realize it, electromagnetic (EM) simulations were performed. All prototypes were manufactured using a four-metal-layer printed circuit board. Two dual-layer RO4350B boards were processed together with a RO4450 prepreg, as shown in Fig. 6.

S-parameter measurements were done with a Rohde & Schwarz ZVM vector network analyzer. An Agilent N8974A noise figure analyzer was used to measure the noise figure of the LNA and of the UWB front-end.

The RO4450 prepreg is made of a sheet material (e.g., glass fabric) impregnated with a resin cured to an intermediate stage, ready for multi-layer printed circuit board bonding.

Fig. 6. Printed circuit board structure: metal 1 (triplexer) / RO4350B / metal 2 (triplexer) / RO4450B prepreg / metal 3 (ground) / RO4350B / metal 4 (LNA).

Table 1. Printed circuit board parameters

Parameter (Rogers 4350B)             Dimension
Dielectric height                    0.254 mm
Dielectric constant                  3.48 ± 0.05
Dissipation factor                   0.004

Parameter (Rogers 4450B)             Dimension
Dielectric height                    0.200 mm
Dielectric constant                  3.54 ± 0.05
Dissipation factor                   0.004

Parameter (Metal, common)            Dimension
Metal thickness, layers 1 and 4      0.035 mm
Metal thickness, layers 2 and 3      0.025 mm
Metal conductivity                   5.8 x 10^7 S/m (copper)
Surface roughness                    0.001 mm

Table 1 lists the printed circuit board parameters, and Fig. 6 illustrates the stack of the printed circuit board layers. Metal layers 1 and 4 are thicker than metal layers 2 and 3 because the surface layers are plated twice, while the embedded metal layers 2 and 3 are plated once.
B. The triplexer

Fig. 7a shows the measured forward transmission |S21| of the triplexer in the antenna system. A rather flat response is seen. The transmission line network is optimized together with the filters to achieve a high blocking of neighboring bands. The measured total insertion loss is 3.0-3.5 dB for the three sub-bands. All sub-bands have at least 500 MHz bandwidth by the -3 dB criterion, i.e., less than 3 dB variation within the desired frequency spectrum. Fig. 7b shows the isolation between the multiplexed ports. It is seen that the minimum isolation is -23 dB. The minimum isolation occurs between the neighboring sub-bands, and in the remaining spectrum the isolation is better than -23 dB. Furthermore, the isolation between the non-neighboring sub-bands, |S31|, is -51 dB.

Fig. 7. Measured performance of the triplexer (forward transmission in dB vs. frequency in GHz): (a) forward transmission of sub-bands #1-#3, and (b) isolation |S21|, |S32| and |S31| between the multiplexed ports.

C. The UWB LNA

Figs. 8a and 8b show the measured performance of the LNA. The selected topology was optimized for a near-to-minimum noise figure and a maximally flat power gain. The measured forward transfer coefficient |S21| is greater than 13 dB with a 0.6 dB variation. The measured noise figure is smaller than 4 dB over the three sub-bands, between 3.1 and almost 4.8 GHz. The measured noise figure follows the minimum noise figure of the simulated device with some deviation at the upper frequency edge. The supply voltage is 3 V and the consumed current is 13 mA.

Fig. 8. Measured performance of the LNA (vs. frequency in GHz): (a) measured noise figure (dB), and (b) measured forward transmission (dB).

Fig. 9. Triplex LNA system, forward transmission (dB) vs. frequency (GHz) for sub-bands #1-#3: (a) simulation, and (b) measurement.

D. UWB Front-End Evaluation


Fig. 9a shows the forward transmission simulation results of the UWB front-end, and Fig. 9b shows the corresponding measurement results. It is seen that all three sub-bands have at least 500 MHz bandwidth at -3 dB from the top, i.e., from the maximum forward gain within the respective sub-band. The measured overall gain is 3.1-3.5 dB lower than the simulated gain for the three sub-bands. This is mostly due to a slightly higher insertion loss in the triplexer network than estimated by the simulations [12] and a lower measured LNA gain compared to the simulated one. Figs. 9a and 9b also show how the proposed front-end effectively attenuates out-of-band signals, e.g., narrowband 2.4 GHz interferers, while it connects and amplifies the three RF inputs in parallel.


Figs. 10a and 10b show the front-end noise figure simulation and measurement results, respectively. The LNA noise figure and the simulated minimum noise figure are also shown. It is seen that the measured noise figure for the entire system within each 500 MHz sub-band is kept below 6 dB at the sub-band center frequency. However, the measured noise figure of the front-end is larger than the simulated noise figure. The larger values can be explained by (a) a larger insertion loss in the triplexer and (b) a lower amplifier gain.


Fig. 10. Triplex LNA system, noise figure (dB) vs. frequency (GHz), showing the front-end NF, the LNA NF and the simulated LNA NF-min: (a) simulation, and (b) measurement.
Fig. 11a shows the simulated isolation between Ports 2-4 of the UWB front-end, and Fig. 11b shows the corresponding measurement. It is seen that the minimum measured isolation is -27 dB. The minimum isolation occurs at the boundary of the neighboring sub-bands, so in the three passbands the isolation is better than -27 dB. The isolation between the non-neighboring alternate sub-bands is -52 dB.

IV. DISCUSSION

The measured forward gain was approximately 3.1 dB lower than the simulated value. This is mostly due to a slightly lower LNA gain and a higher insertion loss in the triplexer network than predicted by the simulations. A small shift in frequency is also seen for all designs, i.e., a rather static error of approximately 2.5 %. This is due to the fact that the simulated electrical length differs from the measured one, i.e., the simulated phase velocity is higher than the measured one. Filtering interference at radio frequencies was one of the main targets of this project. Consequently, better isolation between the three sub-bands and good attenuation of out-of-band signals were prioritized over the noise figure. However, the noise figure values at the center frequency of each sub-band are below 6 dB.

Fig. 11. Triplex LNA system, isolation (|S21|, |S32| and |S31| in dB) vs. frequency (GHz): (a) simulation, and (b) measurement.

V. CONCLUSION

In this paper, a new multi-band UWB 3.1-4.8 GHz front-end was presented. It consists of a fully integrated pre-selective filter and triplexer network and a wideband, flat-gain low-noise amplifier. The UWB front-end is dedicated to a complete integration of the UWB antenna-LNA system on the same RF module. Using a microstrip network and three combined broadside- and edge-coupled bandpass filters, a multi-band transfer function and RF frequency multiplexing are achieved simultaneously. The low-noise amplifier has been designed for a near-to-minimum noise figure and a flat power gain over the 3.1-4.8 GHz frequency band. The measured LNA noise figure is below 4 dB while the measured overall front-end noise figure is below 6 dB. Attenuation of potential narrowband interference and good isolation between the three sub-channels are achieved.
REFERENCES

[1] "First report and order, revision of Part 15 of the Commission's rules regarding ultra-wideband transmission systems," FCC, Washington, 2002.
[2] G. R. Aiello and G. D. Rogerson, "Ultra wideband wireless systems," IEEE Microwave Magazine, vol. 4, no. 2, pp. 36-47, Jun. 2003.
[3] L. Yang and G. B. Giannakis, "Ultra-wideband communications, an idea whose time has come," IEEE Signal Processing Magazine, pp. 26-54, Nov. 2004.
[4] A. Batra, J. Balakrishnan, G. R. Aiello, J. R. Foerster, and A. Dabak, "Design of a multiband OFDM system for realistic UWB channel environments," IEEE Trans. Microwave Theory and Tech., vol. 52, no. 9, part 1, pp. 2123-2138, Sep. 2004.
[5] S. Chakraborty, N. R. Belk, A. Batra, M. Goel, and A. Dabak, "Towards fully integrated wideband transceivers: fundamental challenges, solutions and future," Proc. IEEE Radio-Frequency Integration Technology: Integrated Circuits for Wideband Communication and Wireless Sensor Networks 2005, pp. 26-29, Dec. 2005.
[6] M. Z. Win and R. A. Scholtz, "Ultra-wide bandwidth time-hopping spread-spectrum impulse radio for wireless multiple-access communications," IEEE Transactions on Communications, vol. 48, pp. 679-691, Apr. 2000.
[7] B. Razavi, T. Aytur, C. Lam, F. R. Yang, K. Y. Li, R. H. Yan, H. C. Hang, C. C. Hsu, and C. C. Lee, "A UWB CMOS transceiver," IEEE Journal of Solid-State Circuits, vol. 40, no. 12, pp. 2555-2562, Dec. 2005.
[8] Standard ECMA-368, "High Rate Ultra Wideband PHY and MAC Standard," 1st edition, Dec. 2005, www.ecma-international.org/publications/files/ECMA-ST/ECMA-368.pdf.
[9] T. W. Fischer, B. Kelleci, K. Shi, A. I. Karsilayan, and E. Serpedin, "An analog approach to suppressing in-band narrow-band interference in UWB receivers," IEEE Transactions on Circuits and Systems, vol. 54, no. 5, pp. 941-950, May 2007.
[10] G. Cusmai, M. Brandolini, P. Rossi, and F. Svelto, "A 0.18-µm CMOS selective receiver front-end for UWB applications," IEEE Journal of Solid-State Circuits, vol. 41, no. 8, pp. 1764-1771, Aug. 2006.
[11] M. Karlsson and S. Gong, "A frequency-triplexed inverted-F antenna system for ultra-wide multi-band systems 3.1-4.8 GHz," to be published in ISAST Transactions on Electronics and Signal Processing, 2007.
[12] M. Karlsson, P. Hakansson, and S. Gong, "A frequency triplexer for ultra-wideband systems utilizing combined broadside- and edge-coupled filters," submitted for publication in IEEE Transactions on Advanced Packaging, 2007.
[13] A. Serban, M. Karlsson, and S. Gong, "Microstrip bias networks for ultra-wideband systems," submitted for publication in ISAST Transactions on Electronics and Signal Processing, 2007.
[14] G. Gonzalez, Microwave Transistor Amplifier Design: Analysis and Design, Prentice Hall, 1997, pp. 344-345.

Adriana Serban received the M.Sc. degree in electronic engineering from Politehnica University, Bucharest, Romania. From 1981 to 1990 she was with the Microelectronica Institute, Bucharest, as a Principal Engineer, where she was involved in mixed integrated circuit (IC) design. From 1992 to 2002 she was with Siemens AG, Munich, Germany, and with Sicon AB, Linköping, Sweden, as a Senior Design Engineer for analog and mixed-signal ICs. Since 2002 she has been a Lecturer at Linköping University, teaching analog/digital system design and RF circuit design. She is working towards her Ph.D. degree in Communication Electronics. Her main research interests are RF circuit design and high-speed integrated circuit design.

Magnus Karlsson was born in Västervik, Sweden, in 1977. He received his M.Sc. and Licentiate of Engineering degrees from Linköping University, Sweden, in 2002 and 2005, respectively. In 2003 he started his Ph.D. studies in the Communication Electronics research group at Linköping University. His main work involves wideband antenna techniques, wideband transceiver front-ends, and wireless communications.

Shaofang Gong was born in Shanghai, China, in 1960. He received his B.Sc. degree from Fudan University in Shanghai in 1982, and the Licentiate of Engineering and Ph.D. degrees from Linköping University, Sweden, in 1988 and 1990, respectively. Between 1991 and 1999 he was a senior researcher at the microelectronics institute Acreo in Sweden. From 2000 to 2001 he was the CTO at a spin-off company from the institute. Since 2002 he has been full professor in communication electronics at Linköping University, Sweden. His main research interest has been communication electronics including RF design, wireless communications and high-speed data transmission.



Application of the Fractal Market Hypothesis for Macroeconomic Time Series Analysis

Jonathan M. Blackledge, Fellow, IET, Fellow, IoP, Fellow, IMA, Fellow, RSS

Abstract - This paper explores the conceptual background to financial time series analysis and financial signal processing in terms of the Efficient Market Hypothesis. By revisiting the principal conventional approaches to market analysis and the reasoning associated with them, we develop a Fractal Market Hypothesis that is based on the application of non-stationary fractional dynamics using an operator of the type

$$\frac{\partial^2}{\partial x^2} - \tau^{q(t)}\frac{\partial^{q(t)}}{\partial t^{q(t)}}$$

where $\tau^{-1}$ is the fractional diffusivity and $q$ is the Fourier dimension which, for the topology considered (i.e. the one-dimensional case), is related to the Fractal Dimension $1 < D_F < 2$ by $q = 1 - D_F + 3/2$.

We consider an approach that is based on the signal q(t) and its interpretation, including its use as a macroeconomic volatility index. In practice, this is based on the application of a moving window data processor that utilises Orthogonal Linear Regression to compute q from the power spectrum of the windowed data. This is applied to FTSE close-of-day data between 1980 and 2007, which reveals plausible correlations between the behaviour of this market over the period considered and the amplitude fluctuations of q(t) in terms of a macroeconomic model that is compounded in the operator above.

Index Terms - Fractional Diffusion Equation, Time Series Analysis, Macroeconomic Modelling, Volatility Index

I. INTRODUCTION

The application of statistical techniques for analysing financial time series is a well established practice. This includes a wide range of stochastic modelling methods and the use of certain partial differential equations for describing financial systems (e.g. the Black-Scholes equation for financial derivatives). Attempts to develop stochastic models for financial time series, which are essentially digital signals composed of 'tick data'¹ [1], [2], can be traced back to the early Twentieth Century when Louis Bachelier [3] proposed that fluctuations in the prices of stocks and shares (which appeared to be yesterday's price plus some random change) could be viewed in terms of random walks in which price changes were entirely independent of each other.

Manuscript received December 1, 2007. The work reported in this paper was supported by Management and Personnel Services Limited (http://www.mapstraining.co.uk) and by the Schneider Group (http://schneidertrading.com). Jonathan Blackledge (jon.blackledge@btconnect.com) is Visiting Professor, Department of Electronic and Electrical Engineering, Loughborough University, England (http://www.lboro.ac.uk/departments/el/staff/blackledge.html) and Extraordinary Professor, Department of Computer Science, University of the Western Cape, Cape Town, Republic of South Africa (http://www.cs.uwc.ac.za).

¹ Data that provides traders with daily tick-by-tick data - time and sales - of trade price, trade time, and volume traded, for example, at different sampling rates as required.

Thus, one of the simplest models for price variation is based on the sum of independent random numbers. This is the basis for Brownian motion (i.e. the random walk motion first observed by the Scottish botanist Robert Brown [4], who, in 1827, noted that pollen grains suspended in water appear to undergo continuous jittery motion - a result of the random impacts on the pollen grains by water molecules) in which the random numbers are considered to conform to a normal distribution.
With macroeconomic financial systems, the magnitude of a change in price du tends to depend on the price u itself. We therefore need to modify the Brownian random walk model to include this observation. In this case, the logarithm of the price change as a function of time t (which is also assumed to conform to a normal distribution) is modelled according to the equation

$$\frac{du}{u} = \sigma\, dv + \mu\, dt \quad \text{or} \quad \frac{d}{dt}\ln u = \mu + \sigma\frac{dv}{dt} \tag{1}$$

where $\sigma$ is the volatility, $dv$ is a sample from a normal distribution and $\mu$ is a drift term which reflects the average rate of growth of an asset². Here, the relative price change of an asset is equal to a random value plus an underlying trend component - a log-normal random walk, e.g. [5]-[8].

² Note that both $\mu$ and $\sigma$ may vary with time $t$.
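A minimal numerical sketch of equation (1) is given below (not part of the original paper); the daily drift and volatility values are illustrative assumptions, not parameters estimated from market data.

```python
# Sketch: simulating the log-normal random walk of equation (1),
# du/u = sigma*dv + mu*dt, with assumed (illustrative) parameters.
import numpy as np

rng = np.random.default_rng(0)
n_days, mu, sigma = 1000, 0.0002, 0.01  # assumed daily drift and volatility
u = np.empty(n_days)
u[0] = 100.0                             # arbitrary starting price
dv = rng.normal(size=n_days - 1)         # samples from a normal distribution
for t in range(1, n_days):
    # relative price change = trend term (mu) + random term (sigma * dv)
    u[t] = u[t - 1] * (1.0 + mu + sigma * dv[t - 1])
log_returns = np.diff(np.log(u))         # d(ln u)/dt by forward differencing
```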
Brownian motion models have the following basic properties: (i) statistical stationarity of price increments, in which samples of Brownian motion taken over equal time increments can be superimposed onto each other in a statistical sense; (ii) scaling of price, where samples of Brownian motion corresponding to different time increments can be suitably re-scaled such that they, too, can be superimposed onto each other in a statistical sense. Such models fail to predict extreme behaviour in financial time series because of the intrinsic assumption that such time series conform to a normal distribution, i.e. Gaussian processes that are stationary, in which the statistics - the standard deviation, for example - do not change with time.

Random walk models, which underpin the so-called Efficient Market Hypothesis (EMH) [9]-[12], have been the basis for financial time series analysis since the work of Bachelier in the late Nineteenth Century. Although the Black-Scholes equation [13], developed in the 1970s for valuing options, is deterministic (one of the first financial models to achieve determinism), it is still based on the EMH, i.e. stationary Gaussian statistics. The EMH is based on the principle that the current price of an asset fully reflects all available information relevant to it and that new information is immediately incorporated into the price.


Thus, in an efficient market, the modelling of asset prices is concerned with modelling the arrival of new information. New information must be independent and random, otherwise it would have been anticipated and would not be new. The arrival of new information can send 'shocks' through the market (depending on the significance of the information) as people react to it and then to each other's reactions. The EMH assumes that there is a rational and unique way to use the available information and that all agents possess this knowledge. Further, the EMH assumes that this chain reaction happens effectively instantaneously. These assumptions are clearly questionable at any and all levels of a complex financial system.

The EMH implies independence of price increments and is typically characterised by a normal or Gaussian Probability Density Function (PDF), which is chosen because most price movements are presumed to be an aggregation of smaller ones, the sums of independent random contributions having a Gaussian PDF. However, it has long been known that financial time series do not follow random walks. An illustration of this is given in Figure 1, which shows a (discrete) financial signal u(t) (data obtained from [14]), the log derivative of this signal d log u(t)/dt and a Gaussian distributed random signal. The log derivative is considered in order to: (i) eliminate the characteristic long term exponential growth of the signal; (ii) obtain a signal on the daily price differences³ in accord with the left hand side term of equation (1). Clearly, there is a marked difference in the characteristics of a real financial signal and a random Gaussian signal. This simple comparison indicates a failure of the statistical independence assumption which underpins the EMH.
The shortcomings of the EMH model (as illustrated in Figure 1) include: failure of the independence and Gaussian distribution of increments assumption, clustering, apparent non-stationarity and failure to explain momentous financial events such as crashes leading to recession and, in some extreme cases, depression. These limitations have prompted a new class of methods for investigating time series obtained from a range of disciplines. For example, Re-scaled Range Analysis (RSRA), e.g. [15], [16], which is essentially based on computing the Hurst exponent [17], is a useful tool for revealing some well disguised properties of stochastic time series such as persistence (and anti-persistence) characterized by non-periodic cycles. Non-periodic cycles correspond to trends that persist for irregular periods but with a degree of statistical regularity often associated with non-linear dynamical systems. RSRA is particularly valuable because of its robustness in the presence of noise. The principal assumption associated with RSRA is concerned with the self-affine or fractal nature of the statistical character of a time series rather than the statistical signature itself. Ralph Elliott first reported on the fractal properties of financial data in 1938 (e.g. [18] and references therein). He was the first to observe that segments of financial time series data of different sizes could be scaled in such a way that they were statistically the same, producing so-called Elliott waves. Since then, many different self-affine models for price variation have been developed, often based on (dynamical) Iterated Function Systems (IFS).

³ The gradient is computed using forward differencing.

Fig. 1. Financial time series for the FTSE value (close-of-day) from 02-04-1984 to 12-12-2007 (top), the log derivative of the same time series (centre) and a Gaussian distributed random signal (bottom).

These models can capture many properties of a financial time series but are not based on any underlying causal theory of the type attempted in this paper.

A good stochastic financial model should ideally consider all the observable behaviour of the financial system it is attempting to model. It should therefore be able to provide some predictions on the immediate future behaviour of the system within an appropriate confidence level. Predicting the markets has become (for obvious reasons) one of the most important problems in financial engineering. Although, at least in principle, it might be possible to model the behaviour of each individual agent operating in a financial market, one can never be sure of obtaining all the necessary information required on the agents themselves and their modus operandi. This principle plays an increasingly important role as the scale of the financial system, for which a model is required, increases. Thus, while quasi-deterministic models can be of value in the understanding of micro-economic systems (with known operational conditions), in an ever increasing global economy (in which the operational conditions associated with the fiscal policies of a given nation state are increasingly open), we can take advantage of the scale of the system to describe its behaviour in terms of functions of random variables.
II. MARKET ANALYSIS

The stochastic nature of financial time series is well known from the values of the stock market major indices, such as the FTSE (Financial Times Stock Exchange) in the UK and the Dow Jones in the US, which are frequently quoted.

Fig. 2. Evolution of the 1987, 1997 and 2007 financial crashes. Normalised plots (i.e. where the data has been rescaled to values between 0 and 1 inclusively) of the daily FTSE value (close-of-day) for 02-04-1984 to 24-12-1987 (top), 05-04-1994 to 24-12-1997 (centre) and 02-04-2004 to 24-09-2007 (bottom).

A principal aim of investors is to attempt to obtain information that can provide some confidence in the immediate future of the stock markets, often based on patterns of the past, patterns that are ultimately based on the interplay between greed and fear. One of the principal components of this aim is based on the observation that there are 'waves within waves' and 'events within events' that appear to permeate financial signals when studied with sufficient detail and imagination. It is these repeating patterns that occupy both the financial investor and the systems modeller alike, and it is clear that although economies have undergone many changes in the last one hundred years, the dynamics of market data do not appear to change significantly (ignoring scale). For example, Figure 2 shows the build-up to three different crashes, the one of 1987 and that of 1997 (both after approximately 900 days) and what may turn out to be a crash of 2007 (at the time of writing this paper). The similarity in behaviour of these signals is remarkable and is indicative of the quest to understand economic signals in terms of some universal phenomenon from which appropriate (macro)economic models can be generated. In an efficient market, only the revelation of some dramatic information can cause a crash, yet post-mortem analysis of crashes typically fails to (convincingly) tell us what this information must have been.
In modern economies, the distribution of stock returns and anomalies like market crashes emerge as a result of considerable complex interaction. In the analysis of financial time series, it is inevitable that assumptions need to be made to make the derivation of a model possible. This is the most vulnerable stage of the process. Over-simplifying assumptions lead to unrealistic models. There are two main approaches to financial modelling. The first approach is to look at the statistics of market data and to derive a model based on an educated guess of the 'mechanics' of the market. The model can then be tested using real data. The idea is that this process of trial and error helps to develop the right theory of market dynamics. The alternative is to 'reduce' the problem and try to formulate a microscopic model such that the desired behaviour 'emerges', again, by guessing agents' strategic rules. This offers a natural framework for interpretation; the problem is that this knowledge may not help to make statements about the future unless some methods for describing the behaviour can be derived from it. Although individual elements of a system cannot be modelled with any certainty, global behaviour can sometimes be modelled in a statistical sense provided the system is complex enough in terms of its network of interconnection and interacting components.
In complex systems, the elements adapt to the aggregate pattern they co-create. As the components react, the aggregate changes; as the aggregate changes, the components react anew. Barring the reaching of some asymptotic state or equilibrium, complex systems keep evolving, producing seemingly stochastic or chaotic behaviour. Such systems arise naturally in the economy. Economic agents, be they banks, firms, or investors, continually adjust their market strategies to the macroscopic economy which their collective market strategies create. It is important to appreciate that there is an added layer of complexity within the economic community: unlike many physical systems, economic elements (human agents) react with strategy and foresight by considering the implications of their actions (some of the time!). Although we cannot be certain whether this fact changes the resulting behaviour, we can be sure that it introduces feedback, which is the very essence of both complex systems and chaotic dynamical systems that produce fractal structures.

The link between dynamical systems, chaos and the economy is an important one because it is dynamical systems that illustrate that local randomness and global determinism can co-exist. Global determinism can be considered, at least in a qualitative sense, in terms of broad social issues and the reaction of distinct groups to changing social attitudes, particularly in economies that have traditionally been enhanced by an open and often pro-active policy towards the immigration of peoples from diverse cultural backgrounds. For example, in 1656, Cromwell permitted an open door policy to immigration from continental Europe, partly in an attempt to enhance the economy of England that had been severely compromised by the English Civil Wars of 1642-46 and 1648-49 [19]. The long term effect of this was to provide a new financial infrastructure that laid the foundations for future economic development. It is arguable that Cromwell's policy is the principal reason why the English revolution of the Eighteenth Century was primarily an industrial one. Issues concerning the current and future economic welfare of England may then be appreciated in terms of the attitudes and values associated with new waves of immigrants and the policy of appeasement adopted at government level.


Complex systems can be split into two categories: equilibrium and non-equilibrium. Equilibrium complex systems, undergoing a phase transition, can lead to 'critical states' that often exhibit random fractal structures in which the statistics of the field are scale invariant. For example, when ferromagnets are heated, as the temperature rises, the spins of the electrons which contribute to the magnetic field gain energy and begin to change in direction. At some critical temperature, the spins form a random vector field with a zero mean and a phase transition occurs in which the magnetic field averages to zero. But the field is not just random, it is a self-affine random field whose statistical distribution is the same at different scales, irrespective of the characteristics of the distribution. Non-equilibrium complex systems or 'driven' systems give rise to 'self organised critical states'; an example is the growing of sand piles. If sand is continuously supplied from above, the sand starts to pile up. Eventually, little avalanches will occur as the sand pile inevitably spreads outwards under the force of gravity. The temporal and spatial statistics of these avalanches are scale invariant.

Financial markets can be considered to be non-equilibrium systems because they are constantly driven by transactions that occur as the result of new fundamental information about firms and businesses. They are complex systems because the market also responds to itself, often in a highly non-linear fashion, and would carry on doing so (at least for some time) in the absence of new information. The price change field is highly non-linear and very sensitive to exogenous shocks, and it is probable that all shocks have a long term effect. Market transactions generally occur globally at the rate of hundreds of thousands per second. It is the frequency and nature of these transactions that dictate stock market indices, just as it is the frequency and nature of the sand particles that dictates the statistics of the avalanches in a sand pile. These are all examples of random scaling fractals [20]-[28].
III. DOES A MACROECONOMY HAVE MEMORY?

When faced with a complex process of unknown origin, it is usual to select an independent process such as Brownian motion as a working hypothesis where the statistics and probabilities can be estimated with great accuracy. However, using traditional statistics to model the markets assumes that they are games of chance. For this reason, investment in securities is often equated with gambling. In most games of chance, many degrees of freedom are employed to ensure that outcomes are random. In the case of a simple dice, a coin or a roulette wheel, for example, no matter how hard you may try, it is physically impossible to master your roll or throw such that you can control outcomes. There are too many non-repeatable elements (speeds, angles and so on) and non-linearly compounding errors involved. Although these systems have a limited number of degrees of freedom, each outcome is independent of the previous one. However, there are some games of chance that involve memory. In Blackjack, for example, two cards are dealt to each player and the object is to get as close as possible to 21 by 'twisting' (taking another card) or 'sticking'. In a 'bust' (over 21), the player loses; the winner is the player that stays closest to 21. Here, memory is introduced because the cards are not replaced once they are taken. By keeping track of the cards used, one can assess the shifting probabilities as play progresses. This game illustrates that not all gambling is governed by Gaussian statistics. There are processes that have long-term memory, even though they are probabilistic in the short term. This leads directly to the question, does the economy have memory? A system has memory if what happens today will affect what happens in the future.
Memory can be tested by observing correlations in the data. If the system today has no effect on the system at any future time, then the data produced by the system will be independently distributed and there will be no correlations. A function that characterises the expected correlations between different time periods of a financial signal u(t) is the Auto-Correlation Function (ACF) defined by

$$A(t) = u(t) \odot u(t) = \int u(\tau)u(\tau - t)\,d\tau$$

where $\odot$ denotes the correlation operation. This function can be computed either directly (evaluation of the above integral) or via application of the power spectrum using the correlation theorem

$$u(t) \odot u(t) \Longleftrightarrow \mid U(\omega)\mid^2$$

where $\Longleftrightarrow$ denotes transformation from real space t to Fourier space $\omega$ (the angular frequency), i.e.

$$U(\omega) = F[u(t)] = \int u(t)\exp(-i\omega t)\,dt$$

where F denotes the Fourier transform operator. The power spectrum $\mid U(\omega)\mid^2$ characterises the amplitude distribution of the correlation function, from which we can estimate the time span of memory effects. This also offers a convenient way to calculate the correlation function (by taking the inverse Fourier transform of $\mid U(\omega)\mid^2$). If the power spectrum has more power at low frequencies, then there are long time correlations and therefore long-term memory effects. Inversely, if there is greater power at the high frequency end of the spectrum, then there are short-term time correlations and evidence of short-term memory. White noise, which characterises a time series with no correlations over any scale, has a uniformly distributed power spectrum.
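A minimal numerical sketch of this route to the ACF (not part of the original paper) is given below: the ACF is obtained by inverse Fourier transforming the power spectrum of the zero-mean, zero-padded signal.

```python
# Sketch: computing the ACF via the correlation theorem, i.e. as the
# inverse Fourier transform of the power spectrum |U(w)|^2.
import numpy as np

def acf_via_power_spectrum(u: np.ndarray) -> np.ndarray:
    u = u - u.mean()                       # work with zero-mean data
    U = np.fft.fft(u, n=2 * len(u))        # zero-pad to avoid wrap-around
    power = np.abs(U) ** 2                 # power spectrum |U(w)|^2
    a = np.fft.ifft(power).real[: len(u)]  # correlation theorem
    return a / a[0]                        # normalise so that A(0) = 1

# For a white-noise surrogate the ACF is close to a delta function:
rng = np.random.default_rng(1)
print(acf_via_power_spectrum(rng.normal(size=4096))[:4])
```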
Since price movements themselves are a non-stationary process, there is no ACF as such. However, if we calculate the ACF of the price increments du/dt, then we can observe how much of what happens today is correlated with what happens in the future. According to the EMH, the economy has no memory and there will therefore be no correlations, except for today with itself. We should therefore expect the power spectrum to be effectively constant and the ACF to be a delta function. The power spectra and the ACFs of the log price changes d log u/dt and their absolute value | d log u/dt | for the FTSE 100 index (daily close) from 02-04-1984 to 24-09-2007 are given in Figure 3. The power spectra of the data are not constant, with rogue spikes (or groups of spikes) at the intermediate and high frequency portions of the spectrum.


Fig. 3. Log-power spectra and ACFs of log price changes and absolute log price changes for the FTSE 100 index (daily close) from 02-04-1984 to 24-09-2007. Top-left: log price changes; top-right: absolute value of log price changes; middle: log power spectra; bottom: ACFs.

For the absolute log price increments, there is evidence of a power law at the low frequency end, indicating that there is additional correlation in the signs of the data.

The ACF of the log price changes is relatively featureless, indicating that the excess of low frequency power within the signal has a fairly subtle effect on the correlation function. However, the ACF of the absolute log price changes contains a number of interesting features. It shows that there are a large number of short range correlations followed by an irregular decline up to approximately 1500 days, after which the correlations start to develop again, peaking at about 2225 days. The system governing the magnitudes of the log price movements clearly has a better long-term memory than it should. The data used in this analysis contains 5932 daily price movements and it is therefore improbable that these results are coincidental, and correlations of this, or any similar type, whatever the time scale, effectively invalidate the independence assumption of the EMH.
IV. STOCHASTIC MODELLING OF MACROECONOMIC DATA

Developing mathematical models to simulate stochastic processes has an important role in financial analysis and information systems in general, where it should be noted that information systems are now one of the most important aspects in terms of regulating financial systems, e.g. [29]-[32]. A good stochastic model is one that accurately predicts the statistics we observe in reality, and one that is based upon some well defined rationale. Thus, the model should not only describe the data, but also help to explain and understand the system.

There are two principal criteria used to define the characteristics of a stochastic field: (i) the PDF or the Characteristic Function (i.e. the Fourier transform of the PDF); (ii) the Power Spectral Density Function (PSDF). The PSDF is the function that describes the envelope or shape of the power spectrum of a signal. In this sense, the PSDF is a measure of the field correlations. The PDF and the PSDF are two of the most fundamental properties of any stochastic field and various terms are used to convey these properties. For example, the term 'zero-mean white Gaussian noise' refers to a stochastic field characterized by a PSDF that is effectively constant over all frequencies (hence the term 'white' as in white light) and has a PDF with a Gaussian profile whose mean is zero.

Stochastic fields can of course be characterized using transforms other than the Fourier transform (from which the PSDF is obtained) but the conventional PDF-PSDF approach serves many purposes in stochastic systems theory. However, there is no general connectivity between the PSDF and the PDF, either in terms of theoretical prediction and/or experimental determination. It is not generally possible to compute the PSDF of a stochastic field from knowledge of the PDF, or the PDF from the PSDF. Hence, in general, the PDF and PSDF are fundamental but non-related properties of a stochastic field. However, for some specific statistical processes, relationships between the PDF and PSDF can be found, for example, between Gaussian and non-Gaussian fractal processes [33] and for differentiable Gaussian processes [34].
There are two conventional approaches to simulating a stochastic field. The first of these is based on predicting the PDF (or the Characteristic Function) theoretically (if possible). A pseudo random number generator is then designed whose output provides a discrete stochastic field that is characteristic of the predicted PDF. The second approach is based on considering the PSDF of a field which, like the PDF, is ideally derived theoretically. The stochastic field is then typically simulated by filtering white noise. A good stochastic model is one that accurately predicts both the PDF and the PSDF of the data. It should take into account the fact that, in general, stochastic processes are non-stationary. In addition, it should, if appropriate, model rare but extreme events in which significant deviations from the norm occur.
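The second approach can be sketched in a few lines (not from the paper): white Gaussian noise is filtered in the Fourier domain so that the output PSDF has a prescribed envelope, here a power law of the type associated with random scaling fractals; the exponent q = 1 is an illustrative choice.

```python
# Sketch: simulating a stochastic field by filtering white noise so that
# its PSDF follows a prescribed envelope ~ 1/|w|^q (illustrative q = 1).
import numpy as np

def filtered_noise(n: int, q: float, seed: int = 2) -> np.ndarray:
    rng = np.random.default_rng(seed)
    W = np.fft.fft(rng.normal(size=n))   # spectrum of white Gaussian noise
    w = np.fft.fftfreq(n)                # digital frequencies
    H = np.zeros(n)
    H[w != 0] = 1.0 / np.abs(w[w != 0]) ** (q / 2)  # |H|^2 ~ 1/|w|^q
    return np.fft.ifft(W * H).real       # filtered (coloured) noise

u = filtered_noise(4096, q=1.0)
```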
New market phenomena result either from a strong theoretical reasoning or from compelling experimental evidence or both. In econometrics, the processes that create time series such as the FTSE have many component parts and the interaction of those components is so complex that a deterministic description is simply not possible. As in all complex systems theory, we are usually required to restrict the problem to modelling the statistics of the data rather than the data itself, i.e. to develop stochastic models. When creating models of complex systems, there is a trade-off between simplifying and deriving the statistics we want to compare with reality and simulating the behaviour through an emergent statistical behaviour. Stochastic simulation allows us to investigate the effect of various traders' behavioural rules on the global statistics of the market, an approach that provides for a natural interpretation and an understanding of how the amalgamation of certain concepts leads to these statistics.
One cause of correlations in market price changes (and


volatility) is mimetic behaviour, known as 'herding'. In general, market crashes happen when large numbers of agents place sell orders simultaneously, creating an imbalance to the extent that market makers are unable to absorb the other side without lowering prices substantially. Most of these agents do not communicate with each other, nor do they take orders from a leader. In fact, most of the time they are in disagreement, and submit roughly the same amount of buy and sell orders. This is a healthy non-crash situation; it is a diffusive (random-walk) process which underlies the EMH and financial portfolio rationalization.

One explanation for crashes involves a replacement for the EMH by the Fractal Market Hypothesis (FMH), which is the basis of the model considered in this paper. The FMH proposes the following: (i) the market is stable when it consists of investors covering a large number of investment horizons, which ensures that there is ample liquidity for traders; (ii) information is more related to market sentiment and technical factors in the short term than in the long term - as investment horizons increase, longer term fundamental information dominates; (iii) if an event occurs that puts the validity of fundamental information in question, long-term investors either withdraw completely or invest on shorter terms (i.e. when the overall investment horizon of the market shrinks to a uniform level, the market becomes unstable); (iv) prices reflect a combination of short-term technical and long-term fundamental valuation and thus, short-term price movements are likely to be more volatile than long-term trades - they are more likely to be the result of crowd behaviour; (v) if a security has no tie to the economic cycle, then there will be no long-term trend and short-term technical information will dominate. Unlike the EMH, the FMH states that information is valued according to the investment horizon of the investor. Because the different investment horizons value information differently, the diffusion of information will also be uneven. Unlike most complex physical systems, the agents of the economy, and perhaps to some extent the economy itself, have an extra ingredient, an extra degree of complexity. This ingredient is consciousness.
V. RANDOM WALK PROCESSES

The purpose of revisiting random walk processes is that it provides a useful conceptual reference for the model that is introduced later on in this paper and, in particular, an appreciation of the use of the fractional diffusion equation for describing self-affine stochastic fields, an equation that arises through the unification of coherent and incoherent random walks. We shall consider a random walk in the plane where the amplitude remains constant but where the phase changes, first by a constant factor and then by a random value between 0 and 2π.

A. Coherent (Constant) Phase Walks

Consider a walk in the (real) plane where the length from one step to another is constant - the amplitude a - and where the direction that is taken after each step is the same. In this simple case, the walker continues in a straight line and after

n steps the total length of the path the walker has taken will be just an. We define this value as the resultant amplitude A - the total length of the walk - which will change only by account of the number of steps taken. Thus,

$$A = an.$$

If each step takes a set period of time t to complete, then it is clear that

$$A(t) = at.$$

This scenario is limited by the fact that we are assuming that each step is of precisely the same length and takes precisely the same period of time to accomplish. In general, we consider a to be the mean value of all the step lengths and t to be the cumulative time associated with the average time taken to perform all steps. A walk of this type has a coherence from one step or cluster of steps to the next, is entirely predictable and correlated in time.

If the same walk takes place in the complex plane then the phase from one step to the next is the same. Thus, the result is given by

$$A\exp(i\theta) = \sum_{n} a\exp(i\theta) = na\exp(i\theta).$$

The resultant amplitude is given by na as before and the total phase value is $\theta$. We can also define the intensity, which is given by

$$I = \mid A\exp(i\theta)\mid^2 = A^2.$$

Thus, as a function of time, the intensity associated with this coherent phase walk is given by

$$I(t) = a^2 t^2.$$
Suppose we make the walk slightly more complicated and consider the case where the phase increases by a small constant factor $\Delta\theta$ at each step. After n steps, the result will be given by the sum of all the steps taken, i.e.

$$A\exp(i\theta) = a\sum_{n}\exp(in\Delta\theta)$$
$$= a[1 + \exp(i\Delta\theta) + \exp(2i\Delta\theta) + ... + \exp[i(n-1)\Delta\theta]]$$
$$= a\frac{[1 - \exp(in\Delta\theta)]}{[1 - \exp(i\Delta\theta)]} = a\frac{\exp(in\Delta\theta/2)[\exp(-in\Delta\theta/2) - \exp(in\Delta\theta/2)]}{\exp(i\Delta\theta/2)[\exp(-i\Delta\theta/2) - \exp(i\Delta\theta/2)]}$$
$$= a\exp[i(n-1)\Delta\theta/2]\frac{\sin(n\Delta\theta/2)}{\sin(\Delta\theta/2)}.$$

Now, after many steps, when n is large,

$$\theta \equiv (n-1)\Delta\theta/2 \simeq n\Delta\theta/2$$

and when the phase change is small,

$$\sin(\Delta\theta/2) \simeq \frac{\Delta\theta}{2} \simeq \frac{\theta}{n},$$

and we obtain the result

$$A\exp(i\theta) = na\exp[i(n-1)\Delta\theta/2]\,\mathrm{sinc}\,\theta, \quad \mathrm{sinc}\,\theta = \frac{\sin\theta}{\theta}.$$

For very small changes in the phase, $\Delta\theta << 1$, $\mathrm{sinc}\,\theta \simeq 1$ and the resultant amplitude A is, as before, given by an or, as a function of time, by at.


B. Incoherent (Random) Phase Walks

Incoherent or random phase walks are the basis for modelling many kinds of statistical fluctuations. It is also the principal physical model associated with the stochastic behaviour of an ensemble of particles that collectively exhibit the process of diffusion. The first quantitative description of Brownian motion was undertaken by Albert Einstein and published in 1905 [35]. The basic idea is to consider a random walk in which the mean value of each step is a but where there is no correlation in the direction of the walk from one step to the next. That is, the direction taken by the walker from one step to the next can be in any direction described by an angle between 0 and 360 degrees, or 0 and 2π radians, for a walk in the plane. The angle that is taken at each step is entirely random and all angles are taken to be equally likely. Thus, the PDF of angles between 0 and 2π is given by

$$\Pr[\theta] = \begin{cases} \dfrac{1}{2\pi}, & 0 \le \theta \le 2\pi; \\[4pt] 0, & \text{otherwise.} \end{cases}$$

If we consider the random walk to take place in the complex plane, then after n steps the position of the walker will be determined by a resultant amplitude A and phase angle $\theta$ given by the sum of all the steps taken, i.e.

$$A\exp(i\theta) = a\exp(i\theta_1) + a\exp(i\theta_2) + ... + a\exp(i\theta_n) = a\sum_{m=1}^{n}\exp(i\theta_m).$$

The problem is to obtain a scaling relationship between A and n. Clearly, we should not expect A to be proportional to the number of steps n as is the case with a coherent walk. The trick to finding this relationship is to analyse the result of taking the square modulus of $A\exp(i\theta)$. This provides an expression for the intensity I given by

$$I = a^2\sum_{m=1}^{n}\exp(i\theta_m)\sum_{m=1}^{n}\exp(-i\theta_m) = a^2\left(n + \sum_{j=1}^{n}\sum_{\substack{k=1 \\ k\neq j}}^{n}\exp(i\theta_j)\exp(-i\theta_k)\right).$$

Now, in a typical term $\exp(i\theta_j)\exp(-i\theta_k) = \cos(\theta_j - \theta_k) + i\sin(\theta_j - \theta_k)$ of the double summation, the functions $\cos(\theta_j - \theta_k)$ and $\sin(\theta_j - \theta_k)$ have random values between $\pm 1$. Consequently, as n becomes larger and larger, the double sum reduces to zero since more and more of these terms cancel each other out. This insight is the basis for stating that, for n >> 1,

$$I = a^2 n$$

and the resulting amplitude is therefore given by

$$A = a\sqrt{n}.$$

In this case, A is proportional to the square root of the number of steps taken and, if each step is taken over a mean time period, then we obtain the result

$$A(t) = a\sqrt{t}.$$

With a coherent walk we can state that the resulting amplitude after a time t will be at. This is a deterministic result. However, with an incoherent random walk, the interpretation of the above result is that $a\sqrt{t}$ is the amplitude associated with the most likely position that the random walker will be at after time t. If we imagine many random walkers, each starting out on their journey from the origin of the (complex) plane at t = 0, and record the distances from the origin of this plane after a set period of time t, then the PDF of A will have a maximum value - the mode of the distribution - that occurs at $a\sqrt{t}$. In the case of a perfectly coherent walk, the PDF will consist of a unit spike that occurs at at.

Figure 4 shows a coherent and an incoherent phase walk in the plane. Each position of the walk $(x_j, y_j)$, j = 1, 2, 3, ..., N has been computed using (for a = 1)

$$x_j = \sum_{i=1}^{j}\cos(\theta_i), \quad y_j = \sum_{i=1}^{j}\sin(\theta_i)$$

where $\theta_i \in [0, 2\pi]$ is uniformly distributed and computed using the standard linear congruential pseudo random number generator

$$x_{i+1} = ax_i \bmod P, \quad i = 1, 2, ..., N \tag{2}$$

with $a = 7^7$ and $P = 2^{31} - 1$ and an arbitrary value of $x_0$ - the seed. For the coherent phase walk

$$\theta_i = \frac{2\pi}{16}\frac{x_i}{\parallel x \parallel_\infty}$$

which limits the angle to a small range between 0 and $\pi/8$ radians⁴. For the incoherent phase walk, the range of values is between 0 and $2\pi$ radians, i.e.

$$\theta_i = 2\pi\frac{x_i}{\parallel x \parallel_\infty}.$$

⁴ $\parallel x \parallel_\infty$ denotes the uniform norm, equivalent to the maximum value of the array vector x.

VI. P HYSICAL I NTERPRETATION


Now, in a typical term
exp(ij ) exp(ik ) = cos(j k ) + i sin(j k )
of the double summation, the functions cos(j k ) and
sin(j k ) have random values between 1. Consequently,
as n becomes larger and larger, the double sum will reduces
to zero since more and more of these terms cancel each other
out. This insight is the basis for stating that for n >> 1
I = a2 n
and the resulting amplitude is therefore given by

A = a n.
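Walks of the type shown in Figure 4 are straightforward to reproduce numerically. The following minimal sketch (in Python, assuming NumPy and Matplotlib; the paper itself gives no code) implements a congruential generator of the form of equation (2) and the cumulative sums for x_j and y_j. The Park-Miller multiplier a = 7^5 is used here in place of the multiplier quoted above, since it is known to give a full-period sequence; the seed value is arbitrary.

    # Minimal sketch: coherent vs incoherent phase walks in the plane,
    # after equation (2). The multiplier 7**5 (Park-Miller) is an
    # assumption of this sketch, not necessarily the paper's value.
    import numpy as np
    import matplotlib.pyplot as plt

    def lehmer(n, seed=12345, a=7**5, P=2**31 - 1):
        """Linear congruential sequence x_{i+1} = a*x_i mod P of length n."""
        x = np.empty(n)
        xi = seed
        for i in range(n):
            xi = (a * xi) % P
            x[i] = xi
        return x

    N = 100
    x = lehmer(N)
    theta_incoherent = 2 * np.pi * x / x.max()        # angles in [0, 2*pi]
    theta_coherent = 2 * np.pi * x / (16 * x.max())   # angles in [0, pi/8]

    for theta, label in [(theta_coherent, "coherent"),
                         (theta_incoherent, "incoherent")]:
        xj = np.cumsum(np.cos(theta))   # x_j = sum_i cos(theta_i), a = 1
        yj = np.cumsum(np.sin(theta))   # y_j = sum_i sin(theta_i)
        plt.plot(xj, yj, label=label)
    plt.legend(); plt.axis("equal"); plt.show()

Note that x.max() plays the role of the uniform norm ||x||_∞ used in the text.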

VI. PHYSICAL INTERPRETATION

In the (classical) kinetic theory of matter (including gases,
liquids, plasmas and some solids), we consider a to be the
average distance a particle travels before it randomly collides
and scatters from another particle. The scattering process is
taken to be entirely elastic, i.e. the interaction does not affect
the particle in any way other than to change the direction in
which it travels. Thus, a represents the mean free path of a
particle. The mean free path is a measure of how far a particle can
travel before scattering with another particle which, in turn, is
related to the number of particles per unit volume - the density
of a gas, for example.

Fig. 4. Examples of a coherent (top) and incoherent (bottom) random walk
in the plane for N = 100.

If we imagine a particle diffusing
through an ensemble of particles, then the mean free path
is a measure of the diffusivity of the medium in which
the process of diffusion takes place. This is a feature of all
classical diffusion processes which can be formulated in terms
of the diffusion equation with diffusivity D. The dimensions of
diffusivity are length^2/time and may be interpreted in terms
of the characteristic distance of a random walk process which
varies with the square root of time.
If we consider a wavefront travelling through space and
scattering from a site that changes the direction of propagation,
then the mean free path can be taken to be the average number
of wavelengths taken by the wavefront to propagate from one
interaction to another. After scattering from many sites, the
wavefront can be considered to have diffused through the
diffuser. Here, the mean free path is a measure of the density
of scattering sites, which in turn, is a measure of the diffusivity
of the material - an optical diffuser, for example.
We can use the random walk model associated with a
wavefield to interpret the flow of information through a
complex network of sites that are responsible for passing
on the information from one site to the next. If a packet of
information (e.g. a stream of bits of arbitrary length) travels
directly from A to B then, in terms of the random walk models
discussed above, the model associated with this information
exchange is propagative; it is a coherent process which is
correlated in time and its principal physical characteristic is
determined by the speed at which the information flows from
A to B. On the other hand, suppose that this information packet
is transferred from A to B via information interchange sites C,
D, ..., Z, ... In this case the flow of information is diffusive and is
characterised by the diffusivity of the information interchange
system. To a first order approximation, the diffusivity will
depend on the number of sites that are required to manage the
reception and transmission of the information packet. As the
number of sites decreases the flow of information becomes
more propagative and less diffusive. Thus, we can consider
the Internet, for example (albeit a good one), to be a source
of information diffusion, not in terms of the diffusion of
the information it conveys but in terms of the way in which
information packets walk through the network. Further, we
can think of the internet itself as being an active medium
for the propagation of financial information from one site to
another.

A. The Classical Diffusion Equation

The homogeneous diffusion equation is given by (for the
one-dimensional case x) [36]

(∂^2/∂x^2 − σ ∂/∂t) u(x, t) = 0

for a diffusivity D = 1/σ. The field u(x, t) represents a measurable
quantity whose space-time dependence is determined
by the random walk of a large ensemble of particles or a
multiple scattered wavefield or information flowing through a
complex network. We consider an initial value for this field
denoted by u_0 ≡ u(x, 0), i.e. the value of u(x, t) at t = 0. For example, u
could be the temperature of a material that starts radiating
heat at time t = 0 from a point in space x due to a mass
of thermally energised particles, each of which undertakes
a random walk from the source of heat in which the most
likely position of any particle after a time t is proportional to
√t. In optical diffusion, for example, u denotes the intensity
of light. The light wavefield is taken to be composed of an
ensemble of wavefronts or rays, each of which undergoes
multiple scattering as it propagates through the diffuser. For a
single wavefront element, multiple scattering is equivalent to
a random walk of that element.
The relationship between a random walk model and the
diffusion equation can also be attributed to Einstein [35]
who derived the diffusion equation using a random particle
model system assuming that the movements of the particles
are independent of the movements of all other particles and
that the motion of a single particle at some interval of time is
independent of its motion at all other times. The derivation is
as follows: Let τ be a small interval of time in which a particle
moves some distance between λ and λ + dλ with a probability
P(λ) where τ is long enough to assume that the movements
of the particle in two separate periods of τ are independent. If
n is the total number of particles and we assume that P(λ) is
constant between λ and λ + dλ, then the number of particles
which will travel a distance between λ and λ + dλ in τ is
given by

dn = nP(λ)dλ.

If u(x, t) is the concentration (number of particles per unit
volume) then the concentration at time t + τ is described by
the integral of the concentration of particles which have been


displaced by λ in time τ, as described by the equation above,
over all possible λ, i.e.

u(x, t + τ) = ∫ u(x + λ, t) P(λ) dλ.

Since τ is assumed to be small, we can approximate u(x, t + τ)
using the Taylor series and write

u(x, t + τ) ≈ u(x, t) + τ ∂u(x, t)/∂t.

Similarly, using a Taylor series expansion of u(x + λ, t), we
have

u(x + λ, t) ≈ u(x, t) + λ ∂u(x, t)/∂x + (λ^2/2!) ∂^2 u(x, t)/∂x^2

where the higher order terms are neglected under the assumption
that if τ is small, then the distance travelled, λ, must also
be small. We can then write

u(x, t) + τ ∂u(x, t)/∂t = u(x, t) ∫ P(λ) dλ + ∂u(x, t)/∂x ∫ λ P(λ) dλ + (1/2) ∂^2 u(x, t)/∂x^2 ∫ λ^2 P(λ) dλ.

For isotropic diffusion, P(λ) = P(−λ) and so P is an even
function with the usual normalization condition

∫ P(λ) dλ = 1.

As λ is an odd function, the product λP(λ) is also an odd
function which, if integrated over all values of λ, equates to
zero. Thus we can write

u(x, t) + τ ∂u(x, t)/∂t = u(x, t) + (1/2) ∂^2 u(x, t)/∂x^2 ∫ λ^2 P(λ) dλ

so that

∂u(x, t)/∂t = ∂^2 u(x, t)/∂x^2 ∫ (λ^2/2τ) P(λ) dλ.

Finally, defining the diffusivity as

D = ∫ (λ^2/2τ) P(λ) dλ

we obtain the diffusion equation

∂u(x, t)/∂t = D ∂^2 u(x, t)/∂x^2.

B. The Classical Wave Equation

The wave equation (homogeneous form) is given by (for
the one-dimensional case) [36]

(∂^2/∂x^2 − (1/c^2) ∂^2/∂t^2) u(x, t) = 0

where c is the wave speed and u denotes the amplitude of the
wavefield. A possible solution to this equation is

u(x, t) = p(x − ct)

which describes a wave with distribution p moving along x at
velocity c. For the initial value problem where

u(x, 0) = v(x),  ∂u(x, 0)/∂t = w(x)

the (d'Alembert) general solution is given by [36]

u(x, t) = (1/2)[v(x − ct) + v(x + ct)] + (1/2c) ∫_{x−ct}^{x+ct} w(ξ) dξ.

This solution is of limited use in that the range of x is
unbounded and only applies to the case of an infinite string.
For the case when w = 0, the solution can be taken to describe
two identical waves with amplitude distribution v(x) travelling
away from each other. Neither wave is taken to undergo any
interaction as it travels along a straight path and thus, after
time t the distance travelled will be ct. This is analogous
to a walker undertaking a perfectly coherent walk with an
average step length of c and after a period of time t reaching
a position ct. The point here is that we can relate the diffusion
equation and the wave equation to two types of processes. The
diffusion equation describes a field generated by incoherent
random processes with no time correlations whereas the wave
equation describes a field generated by coherent processes that
are correlated in time. One of the aims of this paper is to
formulate an equation that models the intermediate case - the
fractional diffusion equation - in which random walk processes
have a directional bias.
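The correspondence between the incoherent random walk and the Gaussian solution of the diffusion equation is easy to verify numerically. The following minimal sketch (Python, assuming NumPy and Matplotlib) releases an ensemble of walkers from the origin and compares the histogram of their x-coordinates after n steps with the Gaussian solution; the value D = 1/4 is a property of this particular simulation (unit steps at uniformly random angles), not a parameter taken from the paper.

    # Sketch: an ensemble of incoherent random walks diffuses; the PDF of
    # position approaches the Gaussian solution of the diffusion equation.
    # Unit step length and unit time per step are assumed.
    import numpy as np
    import matplotlib.pyplot as plt

    rng = np.random.default_rng(0)
    walkers, steps = 20_000, 200
    angles = rng.uniform(0.0, 2.0 * np.pi, size=(walkers, steps))
    x = np.cos(angles).sum(axis=1)           # final x-coordinates

    # var(cos(theta)) = 1/2 per step, so var = steps/2 = 2*D*t with D = 1/4
    t, D = steps, 0.25
    xs = np.linspace(-4 * np.sqrt(2 * D * t), 4 * np.sqrt(2 * D * t), 200)
    gauss = np.exp(-xs**2 / (4 * D * t)) / np.sqrt(4 * np.pi * D * t)

    plt.hist(x, bins=100, density=True, alpha=0.5, label="ensemble")
    plt.plot(xs, gauss, label="Gaussian solution")
    plt.legend(); plt.show()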
VII. HURST PROCESSES

For a walk in the plane, A(t) = at for a coherent walk and
A(t) = a√t for an incoherent walk. However, what would
be the result if the walk was neither coherent nor incoherent
but partially coherent/incoherent? In other words, suppose the
random walk exhibited a bias with regard to the distribution
of angles used to change the direction. What would be the
effect on the scaling law √t? Intuitively, one expects that
as the distribution of angles reduces, the corresponding walk
becomes more and more coherent, exhibiting longer and longer
time correlations until the process conforms to a fully coherent
walk. A simulation of such an effect is given in Figure 5 which
shows a random walk in the (real) plane as the (uniform)
distribution of angles decreases. The walk becomes less and
less random as the width of the distribution is reduced.
The equivalent effect for a random phase walk in three
dimensions is given in Figure 6. Each position of the walk
(x_j, y_j, z_j), j = 1, 2, 3, ..., N has been computed using

x_j = Σ_{i=1}^j cos(θ_i) cos(φ_i),

y_j = Σ_{i=1}^j sin(θ_i) cos(φ_i),

z_j = Σ_{i=1}^j sin(φ_i)

for N = 500. The uniform random number generator used
to compute θ_i and φ_i is the same - equation (2) - but with
different seeds. Conceptually, scaling models associated with
the intermediate case(s) should be based on a generalisation of
the scaling laws √t and t to the form t^H where 0.5 ≤ H < 1.
This reasoning is the basis for generalising the random walk
processes considered so far, the exponent H being known as
the Hurst exponent or dimension.

Fig. 5. Random phase walks in the plane for a uniform distribution of angles
θ_i ∈ [0, 2π] (top left), θ_i ∈ [0, 1.9π] (top right), θ_i ∈ [0, 1.8π] (bottom left)
and θ_i ∈ [0, 1.2π] (bottom right).

Fig. 6. Three dimensional random phase walks for a uniform distribution of
angles (θ_i, φ_i) ∈ ([0, 2π], [0, 2π]) (top left), (θ_i, φ_i) ∈ ([0, 1.6π], [0, 1.6π])
(top right), (θ_i, φ_i) ∈ ([0, 1.3π], [0, 1.3π]) (bottom left) and (θ_i, φ_i) ∈
([0, π], [0, π]) (bottom right).

H E Hurst (1900-1978) was an English civil engineer who
designed dams and worked on the Nile river dam projects in
the 1920s and 1930s. He studied the Nile so extensively that
some Egyptians reportedly nicknamed him 'the father of the
Nile'. The Nile river posed an interesting problem for Hurst
as a hydrologist. When designing a dam, hydrologists need
to estimate the necessary storage capacity of the resulting
reservoir. An influx of water occurs through various natural
sources (rainfall, river overflows etc.) and a regulated amount
needs to be released for primarily agricultural purposes, for
example, the storage capacity of a reservoir being based on
the net water flow. Hydrologists usually begin by assuming
that the water influx is random, a perfectly reasonable
assumption when dealing with a complex ecosystem. Hurst,
however, had studied the 847-year record that the Egyptians
had kept of the Nile river overflows, from 622 to 1469. He
noticed that large overflows tended to be followed by large
overflows until, abruptly, the system would then change to low
overflows, which also tended to be followed by low overflows.
There appeared to be cycles, but with no predictable period.
Standard statistical analysis of the day revealed no significant
correlations between observations, so Hurst, who was aware
of Einstein's work on Brownian motion, developed his own
methodology [37] leading to the scaling law t^H. This scaling law
makes no prior assumptions about any underlying distributions.
It simply tells us how the system is scaling with respect
to time. So how do we interpret the Hurst exponent? We know
that H = 0.5 is consistent with an independently distributed
system. The range 0.5 < H ≤ 1 implies a persistent time
series, and a persistent time series is characterized by positive
correlations. Theoretically, what happens today will ultimately
have a lasting effect on the future. The range 0 < H < 0.5
indicates anti-persistence which means that the time series
covers less ground than a random process. In other words,
there are negative correlations. For a system to cover less
distance, it must reverse itself more often than a random
process.
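The scaling law t^H can be estimated directly from data. The following minimal sketch (Python with NumPy) infers H from the growth of the standard deviation of displacements over a lag m, using std ~ m^H; this estimator is one of several possibilities and is not Hurst's original rescaled-range analysis.

    # Sketch: estimate the Hurst exponent H from the scaling of the
    # standard deviation of lag-m displacements, std ~ m**H.
    import numpy as np

    def hurst(signal, lags=(8, 16, 32, 64, 128, 256)):
        stds = []
        for m in lags:
            disp = signal[m:] - signal[:-m]   # displacement over lag m
            stds.append(disp.std())
        # slope of log(std) against log(m) gives H
        H, _ = np.polyfit(np.log(lags), np.log(stds), 1)
        return H

    rng = np.random.default_rng(1)
    walk = np.cumsum(rng.standard_normal(100_000))   # Brownian walk
    print(hurst(walk))   # close to 0.5 for an uncorrelated (Brownian) walk

A persistent series would return a value between 0.5 and 1, an anti-persistent series a value below 0.5, consistent with the interpretation given above.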

VIII. LÉVY PROCESSES

The generalisation of Einstein's equation A(t) = a√t by
Hurst to the form A(t) = at^H, 0 < H ≤ 1, was necessary in
order for Hurst to analyse the apparent random behaviour of
the annual rise and fall of the Nile river for which Einstein's
model was inadequate. In considering this generalisation,
Hurst paved the way for an appreciation that most natural
stochastic phenomena which, at first sight, appear random, have
certain trends that can be identified over a given period of

time. In other words, many natural random patterns have a
bias to them that leads to time correlations in their stochastic
behaviour, a behaviour that is not an inherent characteristic of
a random walk model and fully diffusive processes in general.
This aspect of stochastic field theory was taken up in the late
1930s by the French mathematician Paul Lévy (1886-1971)
[38].
Lévy processes are random walks whose distribution has
infinite moments. The statistics of (conventional) physical
systems are usually concerned with stochastic fields that have
PDFs where (at least) the first two moments (the mean
and variance) are well defined and finite. Lévy statistics is
concerned with statistical systems where all the moments
(starting with the mean) are infinite.

Many distributions exist where the mean and variance are
finite but are not representative of the process, e.g. the tail of
the distribution is significant, where rare but extreme events
occur. These distributions include Lévy distributions. Lévy's
original approach^5 to deriving such distributions is based on
the following question: Under what circumstances does the
distribution associated with a random walk of a few steps
look the same as the distribution after many steps (except
for scaling)? This question is effectively the same as asking
under what circumstances do we obtain a random walk that
is statistically self-affine. The characteristic function (i.e. the
Fourier transform) P(k) of such a distribution p(x) was first
shown by Lévy to be given by (for symmetric distributions
only)

P(k) = exp(−a | k |^q), 0 < q ≤ 2
where a is a (positive) constant. If q = 0,

p(x) = (1/2π) ∫ exp(−a) exp(ikx) dk = exp(−a) δ(x)

and the distribution is concentrated solely at the origin as
described by the delta function δ(x). When q = 1, the Cauchy
distribution

p(x) = (1/2π) ∫ exp(−a | k |) exp(ikx) dk = (1/π) a/(a^2 + x^2)

is obtained and when q = 2, p(x) is characterized by the
Gaussian distribution

p(x) = (1/2π) ∫ exp(−ak^2) exp(ikx) dk = (1/(2√(πa))) exp[−x^2/(4a)],

whose first and second moments are finite. The Cauchy distribution
has a relatively long tail compared with the Gaussian
distribution and a stochastic field described by a Cauchy
distribution is likely to have more extreme variations when
compared to a Gaussian distributed field. For values of q
between 0 and 2, Lévy's characteristic function corresponds
to a PDF of the form

p(x) ~ 1/x^{1+q}, x → ∞.

This can be shown as follows^6: For 0 < q < 1 and since the
characteristic function is symmetric, we have

p(x) = Re[f(x)]

where (setting a = 1)

f(x) = (1/π) ∫_0^∞ exp(ikx) exp(−k^q) dk

= (1/π) [ (1/(ix)) exp(ikx) exp(−k^q) ]_{k=0}^∞ + (q/(iπx)) ∫_0^∞ exp(ikx) k^{q−1} exp(−k^q) dk

~ (q/(iπx)) ∫ dk H(k) k^{q−1} exp(−k^q) exp(ikx), x → ∞

where

H(k) = 1, k > 0; 0, k < 0.

For 0 < q < 1, f(x) is singular at k = 0 and the greatest
contribution to this integral is the inverse Fourier transform of
H(k)k^{q−1}. Noting that [27]

F^{−1} [1/(ik)^q] ~ 1/x^{1−q}

where F^{−1} denotes the inverse Fourier transform, and that

H(k) ⟷ δ(x) + i/(πx), x → ∞

then, using the convolution theorem, we have

f(x) ~ (q/(iπx)) i^{1−q}/x^q

and thus

p(x) ~ 1/x^{1+q}, x → ∞.

For 1 < q < 2, we can integrate by parts twice to obtain

f(x) = (q/(iπx)) ∫_0^∞ dk k^{q−1} exp(−k^q) exp(ikx)

= (q/(iπx)) [ (1/(ix)) k^{q−1} exp(−k^q) exp(ikx) ]_{k=0}^∞

+ (q/(πx^2)) ∫_0^∞ dk exp(ikx) [(q − 1)k^{q−2} exp(−k^q) − q(k^{q−1})^2 exp(−k^q)], x → ∞.

^5 P Lévy was the research supervisor of B Mandelbrot, the inventor of
fractal geometry.

^6 The author acknowledges Dr K I Hopcraft, School of Mathematical
Sciences, Nottingham University, England, for his help in deriving this result.


The first term of this result is singular and therefore provides
the greatest contribution and thus we can write

f(x) ~ (q(q − 1)/(πx^2)) ∫ H(k) exp(ikx) k^{q−2} exp(−k^q) dk.

In this case, for 1 < q < 2, the greatest contribution to this
integral is the inverse Fourier transform of k^{q−2} and hence

f(x) ~ (q(q − 1)/x^2) i^{2−q}/x^{q−1}

so that

p(x) ~ 1/x^{1+q}, x → ∞

which maps onto the previous asymptotic as q → 1 from the
above.

For q ≥ 2, the second moment of the Lévy distribution
exists and the sums of large numbers of independent trials are
Gaussian distributed. For example, if the result were a random
walk with a step length distribution governed by p(x), q > 2,
then the result would be normal (Gaussian) diffusion, i.e. a
Brownian process. For q < 2 the second moment of this PDF
(the mean square) diverges and the characteristic scale of the
walk is lost. This type of random walk is called a Lévy flight.
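The loss of a characteristic scale for q < 2 is easy to visualise. The following minimal sketch (Python with NumPy and Matplotlib) compares a walk with Cauchy distributed step lengths (the q = 1 case above, sampled by inverting the Cauchy CDF) against a Gaussian walk; occasional extreme steps (flights) dominate the Cauchy trajectory.

    # Sketch: a Levy flight (Cauchy steps, q = 1) against a Brownian walk
    # (Gaussian steps, q = 2). Cauchy variates via inverse-CDF sampling.
    import numpy as np
    import matplotlib.pyplot as plt

    rng = np.random.default_rng(2)
    n = 5000
    u = rng.uniform(size=n)
    cauchy_steps = np.tan(np.pi * (u - 0.5))   # p(x) = 1/(pi*(1 + x**2))
    gauss_steps = rng.standard_normal(n)

    plt.plot(np.cumsum(gauss_steps), label="Brownian (q = 2)")
    plt.plot(np.cumsum(cauchy_steps), label="Levy flight (q = 1)")
    plt.legend(); plt.show()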

IX. THE FRACTIONAL DIFFUSION EQUATION

We can consider a Hurst process to be a form of fractional
Brownian motion based on the generalization

A(t) = at^H, H ∈ (0, 1].

Given that incoherent random walks describe processes whose
macroscopic behaviour is characterised by the diffusion equation,
then, by induction, Hurst processes should be characterised
by generalizing the diffusion operator

∂^2/∂x^2 − σ ∂/∂t

to the fractional form

∂^2/∂x^2 − σ^q ∂^q/∂t^q

where q ∈ (0, 2] and D = 1/σ is the fractional diffusivity.
Fractional diffusive processes can therefore be interpreted
as intermediate between classical diffusive processes (random phase
walks with H = 0.5; diffusive processes with q = 1) and
propagative processes (coherent phase walks for H = 1;
propagative processes with q = 2), e.g. [39], [40] and [38]
- references therein. Fractional diffusion equations can also
be used to model Lévy distributions [41] and fractal time
random walks [42], [43]. However, it should be noted that the
fractional diffusion operator given above is the result of a
phenomenology. It is no more (and no less) than a generalisation
of a well known differential operator to fractional form which
follows from a physical analysis of a fully incoherent random
process and its generalisation to fractional form in terms of the
Hurst exponent. Note that the diffusion and wave equations can
be derived rigorously from a range of fundamental physical
laws (conservation of mass, the continuity equation, Fourier's
law of thermal conduction, Newton's laws of motion and so
on) and that, in comparison, our approach to introducing a
fractional differential operator is based on postulation alone.
It is therefore similar to certain other differential operators, a
notable example being Schrödinger's operator.

The fractional diffusion operator given above is appropriate
for modelling fractional diffusive processes that are stationary.
For non-stationary fractional diffusion, we could consider the
case where the diffusivity is time variant as defined by the
function σ(t). However, a more interesting case arises when
the characteristics of the diffusion processes change over time,
becoming less or more diffusive. This is illustrated in terms
of the random walk in the plane given in Figure 7. Here, the
walk starts off being fully diffusive (i.e. H = 0.5 and q = 1),
changes to being fractionally diffusive (0.5 < H < 1 and
1 < q < 2) and then changes back to being fully diffusive. The
result given in Figure 7 shows a transition between two episodes
that are fully diffusive which has been generated using uniform
phase distributions whose width changes from 2π to 1.8π and
back to 2π. In terms of fractional diffusion, this is equivalent
to having an operator

∂^2/∂x^2 − σ^q ∂^q/∂t^q

where q = 1, t ∈ (0, T_1]; q > 1, t ∈ (T_1, T_2]; q = 1, t ∈
(T_2, T_3] where T_3 > T_2 > T_1. If we want to generalise
such processes over arbitrary periods of time, then we should
consider q to be a function of time. We can then introduce a
non-stationary fractional diffusion operator given by

∂^2/∂x^2 − σ^{q(t)} ∂^{q(t)}/∂t^{q(t)}.

This operator is the theoretical basis for the Fractal Market
Hypothesis considered in this paper.

Fig. 7. Non-stationary random phase walk in the plane.

X. FRACTIONAL DYNAMIC MODEL

We consider an inhomogeneous non-stationary fractional
diffusion equation of the form

(∂^2/∂x^2 − σ^{q(t)} ∂^{q(t)}/∂t^{q(t)}) u(x, t) = F(x, t)

where F is a stochastic source term with some PDF and u
is the stochastic field whose solution we require. Specifying


q to be in the range 0 ≤ q ≤ 2 leads to control over the
basic physical characteristics of the equation so that we can
define an anti-persistent field u(x, t) when q < 1, a diffusive
field when q = 1 and a propagative field when q = 2. In this
case, non-stationarity is introduced through the use of a time
varying fractional derivative whose values modify the physical
characteristics of the equation.

The range of values of q is based on deriving an equation
that is a generalisation of both diffusive and propagative
processes using, what is fundamentally, a phenomenology.
When q = 0 ∀t, the time dependent behaviour is determined
by the source function alone; when q = 1 ∀t, u describes
a diffusive process where D = 1/σ is the diffusivity; when
q = 2 we have a propagative process where σ is the slowness
(the inverse of the wave speed). The latter process should
be expected to propagate information more rapidly than a
diffusive process leading to transients or flights of some type.
We refer to q as the Fourier Dimension which is related to
the Hurst Exponent by q = H + D_T/2 where D_T is the
Topological Dimension and to the Fractal Dimension D_F by
q = 1 − D_F + 3D_T/2 as shown in Appendix I.

Since q(t) drives the non-stationary behaviour of u, the
way in which we model q(t) is crucial. It is arguable that
the changes in the statistical characteristics of u which lead
to its non-stationary behaviour should also be random. Thus,
suppose that we let the Fourier dimension at a time t be
chosen randomly, a randomness that is determined by some
PDF. In this case, the non-stationary characteristics of u will
be determined by the PDF (and associated parameters) alone.
Also, since q is a dimension, we can consider our model to be
based on the statistics of dimension. There are a variety of
PDFs that can be applied which will in turn affect the range of
q. By varying the exact nature of the distribution considered,
we can drive the non-stationary behaviour of u in different
ways. However, in order to apply different statistical models
for the Fourier dimension, the range of q can not be restricted
to any particular range, especially in the case of a normal
distribution. We therefore generalize further and consider the
equation

(∂^2/∂x^2 − σ^{q(t)} ∂^{q(t)}/∂t^{q(t)}) u(x, t) = F(x, t), −∞ < q(t) < ∞, ∀t

which allows us to apply different PDFs for q covering
arbitrary ranges. For example, suppose we consider a system
which is assumed to be primarily diffusive; then a normal
PDF of the type

Pr[q(t)] = (1/(σ√(2π))) exp[−(q − 1)^2/(2σ^2)], −∞ < q < ∞

where σ is the standard deviation, will ensure that u is entirely
diffusive when σ → 0. However, as σ is increased in value,
the likelihood of q = 2 (and q = 0) becomes larger. In
other words, the standard deviation provides control over the
likelihood of the process becoming propagative.

Irrespective of the type of distribution that is considered,
the equation above poses a fundamental problem which is how
to define and work with the term

(∂^{q(t)}/∂t^{q(t)}) u(x, t).

Given the result (for constant q)

(∂^q/∂t^q) u(x, t) = (1/2π) ∫ (iω)^q U(x, ω) exp(iωt) dω

we might generalize as follows:

(∂^{q(τ)}/∂t^{q(τ)}) u(x, t) = (1/2π) ∫ (iω)^{q(τ)} U(x, ω) exp(iωt) dω.

However, if we consider the case where the Fourier dimension
is a relatively slowly varying function of time, then we can
legitimately consider q(t) to be composed of a sequence of
different states q_i = q(t_i). This approach allows us to develop
a stationary solution for a fixed q over a fixed period of time.
Non-stationary behaviour can then be introduced by using the
same solution for different values of q over fixed (or varying)
periods of time and concatenating the solutions for all q.
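The Fourier-based definition of the fractional derivative given above translates directly into a numerical recipe: multiply the discrete spectrum by (iω)^q and invert. The following minimal sketch (Python with NumPy) illustrates this for a sampled, periodic signal; the principal branch is assumed for non-integer q and the singular DC term is suppressed.

    # Sketch: fractional differentiation via d^q/dt^q u <-> (i*omega)^q U.
    # Assumes a periodic sampled signal; q = 1 recovers the ordinary
    # derivative, q = 0.5 a half-derivative.
    import numpy as np

    def fractional_derivative(u, q, dt=1.0):
        n = u.size
        omega = 2 * np.pi * np.fft.fftfreq(n, d=dt)
        iw_q = (1j * omega) ** q
        iw_q[0] = 0.0                    # suppress the singular DC term
        return np.fft.ifft(iw_q * np.fft.fft(u)).real

    t = np.linspace(0, 2 * np.pi, 512, endpoint=False)
    u = np.sin(t)
    print(np.allclose(fractional_derivative(u, 1.0, dt=t[1] - t[0]),
                      np.cos(t), atol=1e-6))   # q = 1: d(sin)/dt = cos

A sequence of states q_i = q(t_i), as described above, can then be treated by applying this operator piecewise over windows with fixed q and concatenating the results.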
XI. GREEN'S FUNCTION SOLUTION

We consider a Green's function solution to the equation

(∂^2/∂x^2 − σ^q ∂^q/∂t^q) u(x, t) = F(x, t), −∞ < q < ∞

when F(x, t) = f(x)n(t) where f(x) and n(t) are both
stochastic functions. Applying a separation of variables here
is not strictly necessary. However, it yields a solution in
which the terms affecting the temporal behaviour of u(x, t)
are clearly identifiable. Thus, we require a general solution to
the equation

(∂^2/∂x^2 − σ^q ∂^q/∂t^q) u(x, t) = f(x)n(t).

Let

u(x, t) = (1/2π) ∫ U(x, ω) exp(iωt) dω

and

n(t) = (1/2π) ∫ N(ω) exp(iωt) dω.

Then, using the result

(∂^q/∂t^q) u(x, t) = (1/2π) ∫ U(x, ω)(iω)^q exp(iωt) dω

we can transform the fractional diffusion equation to the form

(∂^2/∂x^2 + Ω_q^2) U(x, ω) = f(x)N(ω)

where we shall take

Ω_q = i(iωσ)^{q/2}

and ignore the case for Ω_q = −i(iωσ)^{q/2}. Defining the Green's
function g to be the solution of [44], [45]

(∂^2/∂x^2 + Ω_q^2) g(x | x_0, ω) = δ(x − x_0)

where δ is the delta function, we obtain the following solution:

U(x_0, ω) = N(ω) ∫ g(x | x_0, ω) f(x) dx    (3)

where [36]

g(x | x_0, ω) = (i/(2Ω_q)) exp(iΩ_q | x − x_0 |)

under the assumption that u and ∂u/∂x → 0 as x → ±∞.
This result reduces to conventional solutions for cases when
q = 1 (diffusion equation) and q = 2 (wave equation) as shall
now be shown.

A. Wave Equation Solution

When q = 2, the Green's function defined above provides a
solution for the outgoing Green's function. Thus, with Ω_2 =
ωσ, we have

U(x_0, ω) = (N(ω)/(2iωσ)) ∫ exp(iωσ | x − x_0 |) f(x) dx.

Fourier inverting and using the convolution theorem for the
Fourier transform, we get

u(x_0, t) = (1/2) ∫ dx f(x) (1/2π) ∫ (N(ω)/(iωσ)) exp(iωσ | x − x_0 |) exp(iωt) dω

= (1/2σ) ∫ dx f(x) ∫ n(t − σ | x − x_0 |) dt

which describes the propagation of a wave travelling at velocity
1/σ subject to variations in space and time as defined by
f(x) and n(t) respectively. For example, when f and n are
both delta functions,

u(x_0, t) = (1/2σ) H(t − σ | x − x_0 |).

This is a d'Alembertian type solution to the wave equation
where the wavefront occurs at t = σ | x − x_0 | in the causal
case.

B. Diffusion Equation Solution

When q = 1 and Ω_1 = i√(iωσ),

u(x_0, t) = (1/2) ∫ dx f(x) (1/2π) ∫ (exp(−√(iωσ) | x − x_0 |)/√(iωσ)) N(ω) exp(iωt) dω.

For p = iω, we can write this result in terms of a Bromwich
integral (i.e. an inverse Laplace transform) and using the
convolution theorem for Laplace transforms with the result

(1/2πi) ∫_{c−i∞}^{c+i∞} (exp(−a√p)/√p) exp(pt) dp = (1/√(πt)) exp[−a^2/(4t)],

we obtain

u(x_0, t) = (1/(2√σ)) ∫ dx f(x) ∫ (1/√(πt_0)) exp[−σ(x_0 − x)^2/(4t_0)] n(t − t_0) dt_0.

Now, if, for example, we consider the case when n is a delta
function, the result reduces to

u(x_0, t) = (1/(2√(πσt))) ∫ f(x) exp[−σ(x_0 − x)^2/(4t)] dx, t > 0

which describes classical diffusion in terms of the convolution
of an initial source f(x) (introduced at time t = 0) with a
Gaussian function.

C. General Series Solution

The evaluation of u(x_0, t) via direct Fourier inversion for
arbitrary values of q is not possible due to the irrational
nature of the exponential function exp(iΩ_q | x − x_0 |) with
respect to ω. To obtain a general solution, we use the series
representation of the exponential function and write

U(x_0, ω) = (iM_0 N(ω)/(2Ω_q)) [ 1 + Σ_{m=1}^∞ ((iΩ_q)^m/m!) (M_m(x_0)/M_0) ]    (4)

where

M_m(x_0) = ∫ f(x) | x − x_0 |^m dx.

We can now Fourier invert term by term to develop a series
solution. Given that we consider −∞ < q < ∞, this requires
us to consider three distinct cases.


1) Solution for q = 0: Evaluation of u(x_0, t) in this case
is trivial since, from equation (3),

U(x_0, ω) = (M(x_0)/2) N(ω)  or  u(x_0, t) = (M(x_0)/2) n(t)

where

M(x_0) = ∫ exp(− | x − x_0 |) f(x) dx.

2) Solution for q > 0: Fourier inverting, the first term in
equation (4) becomes

(1/2π) ∫ (iN(ω)M_0/(2Ω_q)) exp(iωt) dω = (M_0/(2σ^{q/2})) (1/2π) ∫ (N(ω)/(iω)^{q/2}) exp(iωt) dω

= (M_0/(2σ^{q/2} Γ(q/2))) ∫_{−∞}^t (n(τ)/(t − τ)^{1−(q/2)}) dτ.

The second term is

−(M_1/2) (1/2π) ∫ N(ω) exp(iωt) dω = −(M_1/2) n(t).

The third term is

(M_2/(2·2!)) (1/2π) ∫ (iωσ)^{q/2} N(ω) exp(iωt) dω = (M_2 σ^{q/2}/(2·2!)) (d^{q/2}/dt^{q/2}) n(t)

and the fourth and fifth terms become

−(M_3/(2·3!)) (1/2π) ∫ (iωσ)^q N(ω) exp(iωt) dω = −(M_3 σ^q/(2·3!)) (d^q/dt^q) n(t)

and

(M_4/(2·4!)) (1/2π) ∫ (iωσ)^{3q/2} N(ω) exp(iωt) dω = (M_4 σ^{3q/2}/(2·4!)) (d^{3q/2}/dt^{3q/2}) n(t)

respectively, with similar results for all other terms. Thus,
through induction, we can write u(x_0, t) as a series of the
form

u(x_0, t) = (M_0(x_0)/(2σ^{q/2} Γ(q/2))) ∫_{−∞}^t (n(τ)/(t − τ)^{1−(q/2)}) dτ − (M_1(x_0)/2) n(t)

+ (1/2) Σ_{k=1}^∞ ((−1)^{k+1}/(k + 1)!) M_{k+1}(x_0) σ^{kq/2} (d^{kq/2}/dt^{kq/2}) n(t).

Observe that the first term involves a fractional integral (the
Riemann-Liouville integral), the second term is composed
of the source function n(t) alone (apart from scaling) and
the third term is an infinite series composed of fractional
differentials of increasing order kq/2. Also note that the first
term is scaled by a factor involving σ^{−q/2} whereas the third
term is scaled by a factor that includes σ^{kq/2}.

3) Solution for q < 0: In this case, the first term becomes

(M_0(x_0) σ^{q̄/2}/2) (d^{q̄/2}/dt^{q̄/2}) n(t)

where q̄ ≡ | q |, q < 0. The second term is the same as in the
previous case (for q > 0) and the third term is

(1/2) Σ_{k=1}^∞ ((−1)^{k+1}/(k + 1)!) M_{k+1}(x_0) (1/(σ^{kq̄/2} Γ(kq̄/2))) ∫_{−∞}^t (n(τ)/(t − τ)^{1−(kq̄/2)}) dτ.

Here, the solution is composed of three terms: a fractional
differential, the source term and an infinite series of fractional
integrals of order kq̄/2. Thus, the roles of fractional differentiation
and fractional integration are reversed as q changes
from being greater than to less than zero. All fractional
differential operators associated with the equations above and
henceforth should be considered in terms of the definition for
a fractional differential given by

D^q f(t) = (d^n/dt^n) [I^{n−q} f(t)], n − q > 0

where I is the fractional integral operator (the Riemann-Liouville
transform),

I^p f(t) = (1/Γ(p)) ∫_{−∞}^t (f(τ)/(t − τ)^{1−p}) dτ, p > 0.    (5)

The reason for this is that direct fractional differentiation
can lead to divergent integrals. However, there is a deeper
interpretation of this result that has a synergy with the issue
over whether a macroeconomic system has memory and
is based on observing that the evaluation of a fractional
differential operator depends on the history of the function in
question. Thus, unlike an integer differential operator of order
n, a fractional differential operator of order q has memory

because the value of I^{n−q} f(t) at a time t depends on the
behaviour of f(t) from −∞ to t via the convolution with
t^{(n−q)−1}/Γ(n − q). The convolution process is of course
dependent on the history of a function f(t) for a given kernel
and thus, in this context, we can consider a fractional derivative
defined via the result above to have memory. In this sense, the
operator

∂^2/∂x^2 − σ^{q(t)} ∂^{q(t)}/∂t^{q(t)}

describes a process, compounded in a field u(x, t), that has a
non-stationary memory association with the temporal characteristics
of the system it is attempting to model. This is not
an intrinsic characteristic of systems that are purely diffusive
(q = 1) or propagative (q = 2).
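Equation (5) can be evaluated numerically as a quadrature against the kernel (t − τ)^{p−1}/Γ(p). The following minimal sketch (Python with NumPy) does this with a midpoint rule, taking the lower limit as 0 rather than −∞ (a common computational compromise), and checks the result against the exact half-integral of a constant, I^{1/2} 1 = 2√t/√π.

    # Sketch: Riemann-Liouville fractional integral, after equation (5),
    # by direct quadrature. Lower limit 0 is assumed for computability;
    # the midpoint rule avoids the integrable singularity at tau = t.
    import numpy as np
    from math import gamma, sqrt, pi

    def rl_integral(f, t, p, n=20000):
        dt = t / n
        tau = (np.arange(n) + 0.5) * dt
        kernel = (t - tau) ** (p - 1) / gamma(p)
        return float(np.sum(kernel * f(tau)) * dt)

    t = 2.0
    approx = rl_integral(lambda tau: np.ones_like(tau), t, p=0.5)
    exact = 2 * sqrt(t) / sqrt(pi)     # half-integral of f = 1
    print(approx, exact)               # the two agree closely

The kernel's dependence on the whole history of f is visible here directly: every sample of f(τ) for τ < t contributes to the value at t, which is the memory property discussed above.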
D. Asymptotic Solutions for an Impulse

We consider a special case in which the source function
f(x) is an impulse so that

M_m(x_0) = ∫ δ(x) | x − x_0 |^m dx = | x_0 |^m.

This result immediately suggests a study of the asymptotic
solution

u(t) = lim_{x_0 → 0} u(x_0, t) =
  (1/(2σ^{q/2} Γ(q/2))) ∫_{−∞}^t (n(τ)/(t − τ)^{1−(q/2)}) dτ, q > 0;
  n(t)/2, q = 0;
  (σ^{q̄/2}/2) (d^{q̄/2}/dt^{q̄/2}) n(t), q < 0.    (6)

Note that q = 0 defines the Hilbert transform of n(t) whose
spectral properties in the positive half space are identical to
n(t) because

(1/t) ⊗ n(t) ⟷ −iπ sign(ω) N(ω)

where

sign(ω) = 1, ω > 0; −1, ω < 0.

The statistical properties of the Hilbert transform of n(t) are
therefore the same as n(t) so that

Pr[t^{−1} ⊗ n(t)] = Pr[n(t)].

Hence, as q → 0, the statistical properties of u(t) will reflect
those of n, i.e.

Pr[u(t)] → Pr[n(t)], q → 0.

However, as q → 2 we can expect the statistical properties of
u(t) to be such that the width of the PDF of u(t) is reduced.
This reflects the greater level of coherence (persistence in time)
associated with the stochastic field u(t) for q → 2.

The solution for the time variations of the stochastic field u
for q > 0 is then given by a fractional integral alone and,
for q < 0, by a fractional differential alone. In particular, for
q > 0, we see that the solution is based on the convolution
integral (ignoring scaling)

u(t) = (1/t^{1−q/2}) ⊗ n(t), q > 0

where ⊗ denotes convolution and, in ω-space (ignoring scaling),

U(ω) = N(ω)/(iω)^{q/2}.

This result is the conventional random fractal noise model for
Fourier dimension q. Table I quantifies the results for different
values of q with conventional name associations^7. The field u
has the following fundamental property for q ∈ (0, 2):

λ^{q/2} Pr[u(λt)] = Pr[u(t)].

This property describes the statistical self-affinity of u. Thus,
the asymptotic solution considered here yields a result that
describes a random scaling fractal field characterized by a
PSDF of the form 1/| ω |^q which is a measure of the time
correlations in the signal.

q-value   t-space                 ω-space (PSDF)   Name
q = 0     (1/t) ⊗ n(t)            1                White noise
q = 1     (1/√t) ⊗ n(t)           1/| ω |          Pink noise
q = 2     ∫ n(t) dt               1/ω^2            Brown noise
q > 2     (1/t^{1−q/2}) ⊗ n(t)    1/| ω |^q        Black noise

TABLE I
NOISE CHARACTERISTICS FOR DIFFERENT VALUES OF q. NOTE THAT THE
RESULTS GIVEN ABOVE IGNORE SCALING FACTORS.

^7 Note that Brown noise conventionally refers to the integration of white
noise but that Brownian motion is a form of pink noise because it classifies
diffusive processes identified by the case when q = 1.

E. Other Asymptotic Solutions

A similar result to the asymptotic solution for x_0 → 0 is
obtained when the diffusivity is large, i.e.

lim_{σ → 0} u(x_0, t) = (M_0(x_0)/(2σ^{q/2} Γ(q/2))) ∫_{−∞}^t (n(τ)/(t − τ)^{1−(q/2)}) dτ − (M_1(x_0)/2) n(t), q > 0.    (7)

Here, the solution is the sum of fractal noise and white noise.
Further, by relaxing the condition σ → 0 we can consider the
approximation

u(x_0, t) ≈ (M_0(x_0)/(2σ^{q/2} Γ(q/2))) ∫_{−∞}^t (n(τ)/(t − τ)^{1−(q/2)}) dτ − (M_1(x_0)/2) n(t) + (M_2(x_0) σ^{q/2}/(2·2!)) (d^{q/2}/dt^{q/2}) n(t), q > 0, σ << 1    (8)

in which the solution is expressed in terms of the sum of
fractal noise, white noise and the fractional differentiation^8 of
white noise.

^8 As defined by equation (5).
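The fractal noise model U(ω) = N(ω)/(iω)^{q/2} is also a recipe for synthesising test signals: filter white noise in the Fourier domain. A minimal sketch (Python with NumPy; the DC term is dropped and the principal branch assumed, both assumptions of this sketch):

    # Sketch: synthesise random fractal noise with PSDF ~ 1/|omega|**q by
    # filtering white Gaussian noise with 1/(i*omega)**(q/2), after the
    # asymptotic model above. q = 0 gives white noise, q = 2 Brown noise.
    import numpy as np

    def fractal_noise(n, q, seed=0):
        rng = np.random.default_rng(seed)
        N = np.fft.fft(rng.standard_normal(n))   # spectrum of white noise
        omega = 2 * np.pi * np.fft.fftfreq(n)
        filt = np.zeros(n, dtype=complex)
        nz = omega != 0
        filt[nz] = 1.0 / (1j * omega[nz]) ** (q / 2)   # ignore the DC term
        return np.fft.ifft(filt * N).real

    u = fractal_noise(4096, q=1.0)   # "pink"-type noise, the diffusive case

Signals generated this way exhibit the statistical self-affinity discussed above and are useful for validating any estimator of q before it is applied to market data.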

F. Equivalence with a Wavelet Transform

The wavelet transform is defined in terms of projections of
f(t) onto a family of functions that are all normalized dilations
and translations of a prototype wavelet function w [47], i.e.

W[f(t)] = F_L(t) = ∫ f(τ) w_L(τ, t) dτ

where

w_L(τ, t) = (1/L) w((τ − t)/L), L > 0.

The independent variables L and t are continuous dilation and
translation parameters respectively. The wavelet transformation
is essentially a convolution transform where w_L(t) is the
convolution kernel with dilation variable L. The introduction
of this factor provides dilation and translation properties into
the convolution integral that gives it the ability to analyse
signals in a multi-resolution role (the convolution integral is
now a function of L), i.e.

F_L(t) = w_L(t) ⊗ f(t), L > 0.

In this sense, the asymptotic solution (ignoring scaling)

u(t) = (1/t^{1−q/2}) ⊗ n(t), q > 0, x_0 → 0

is compatible with the case of a wavelet transform where

w_1(t) = 1/t^{1−q/2}

for the stationary case and where, for the non-stationary case,

w_1(t, τ) = 1/t^{1−q(τ)/2}.
XII. FTSE ANALYSIS USING OLR

We consider the basic model for a financial signal to be
given by

u(t) = (1/t^{1−q/2}) ⊗ n(t), q > 0

which has characteristic spectrum

U(ω) = N(ω)/(iω)^{q/2}

and is a solution to the fractional diffusion equation

(∂^2/∂x^2 − σ^q ∂^q/∂t^q) u(x, t) = δ(x)n(t), x → 0.

The PSDF is thus characterised by ω^{−q}, ω ≥ 0 and our
problem is thus to compute q from the data P(ω) = | U(ω) |^2,
ω ≥ 0. For this data, we consider the PSDF

P̂(ω) = c/ω^q

or

ln P̂(ω) = C − q ln ω

where C = ln c. The problem is therefore reduced to implementing
an appropriate method to compute q (and C) by
finding a best fit of the line ln P̂(ω) to the data ln P(ω).
Application of the least squares method for computing q,
which is based on minimizing the error

e(q, C) = || ln P(ω) − ln P̂(ω, q, C) ||_2^2

with regard to q and C, leads to errors in the estimates for
q which are not compatible with market data analysis. The
reason for this is that relative errors at the start and end
of the data ln P may vary significantly, especially because
any errors inherent in the data P will be amplified through
application of the logarithmic transform required to linearise
the problem. In general, application of a least squares approach
is very sensitive to statistical heterogeneity [48] and, in this
application, may provide values of q that are not compatible
with the rationale associated with the FMH (i.e. values of 1 <
q < 2 that are intermediate between diffusive and propagative
processes). For this reason, an alternative approach must be
considered which, in this paper, is based on Orthogonal Linear
Regression (OLR).

Applying a standard moving window, q(t) is computed by
repeated application of OLR based on the m-code available
from [49]. Since q is, in effect, a statistic, its computation
is only as good as the quantity (and quality) of data that
is available for its computation. For this reason, a relatively
large window is required whose length is compatible with:
(i) the number of samples available; (ii) the autocorrelation
function and long-term memory effects as discussed in Section
III. An example of the q(t) signal obtained using a 1000
element window is given in Figure 8 which includes q(t) after
it has been smoothed using a Gaussian low-pass filter to
reveal the underlying trends in q. Inspection of the data (i.e.
closer inspection of the time series than is shown in Figure 8)
clearly illustrates a qualitative relationship between trends in
the financial data and q(t) in accordance with the theoretical
model considered. In particular, over periods of time in which
q increases in value, the amplitude of the financial signal
u(t) decreases. Moreover, and more importantly, an upward
trend in q appears to be a pre-cursor to a downward trend in
u(t). A more detailed example of this behaviour is shown in
Figure 9 for close of day FTSE data over a smaller period of
time (i.e. from 1994 to 1997), a correlation that is compatible
with the idea that a rise in the value of q relates to the
system becoming more propagative which, in stock market
terms, indicates the likelihood of the markets becoming bear
dominant in the future.

The results of using the method discussed above not only
provide for a general appraisal of different macroeconomic
financial time series but, with regard to the size of the selected
window used, an analysis of data at any point in time.
The output can be interpreted in terms of persistence and
anti-persistence and in terms of the existence or absence
of after-effects (macroeconomic memory effects). For those
periods in time when q(t) is relatively constant, the existing
market tendencies usually remain. Changes in the existing
trends tend to occur just after relatively sharp changes in
q(t) have developed. This behaviour indicates the possibility
of using the time series q(t) for identifying the behaviour
of a macroeconomic financial system in terms of both inter-market
and between-market analysis. These results support the
possibility of using q(t) as an independent macroeconomic
volatility predictor. It is noted that, at the time of writing
this paper, the values of q(t) associated with those days after
approximately day 4800 in Figure 8 (representing the latter
half of 2007) indicate the growth of propagative behaviour and
thus the macroeconomic instability compounded in the term
Credit Crunch. This is not surprising if it is assumed that the
downward trend from approximately day 3000 to day 3700
shown in Figure 8 is a natural consequence of the effect of a
higher inflationary global economy resulting from the end of
the cold war and that the upward trend from approximately day
3700 to 5000 is a consequence of credit policies adopted by
banks in an attempt to compensate for this natural inflationary
pressure. Under this assumption, the Credit Crunch of 2007
represents a transition that is compounded in a reappraisal of
the definition of poverty, namely, that poverty is not a measure
of how little one has but a measure of how much one owes.
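The moving-window scheme described above can be sketched as follows (Python with NumPy; this is an illustration of the approach, not the m-code of [49], and the function names fourier_dimension and q_signal are introduced here for illustration only). OLR, i.e. total least squares, is implemented via the principal axis of the (ln ω, ln P) point cloud.

    # Sketch: estimate q by fitting ln P(omega) = C - q ln(omega) with
    # orthogonal linear regression (principal axis of the log-log cloud),
    # applied over a moving window as described in the text.
    import numpy as np

    def fourier_dimension(u):
        n = u.size
        P = np.abs(np.fft.rfft(u)) ** 2          # periodogram (PSDF estimate)
        omega = np.arange(1, n // 2 + 1)         # positive frequencies only
        X = np.column_stack([np.log(omega), np.log(P[1:len(omega) + 1])])
        X -= X.mean(axis=0)
        _, vecs = np.linalg.eigh(np.cov(X.T))    # 2x2 eigen-decomposition
        direction = vecs[:, -1]                  # principal axis = OLR line
        return -direction[1] / direction[0]      # ln P = C - q ln(omega)

    def q_signal(u, window=1000, step=50):
        return np.array([fourier_dimension(u[i:i + window])
                         for i in range(0, u.size - window, step)])

    rng = np.random.default_rng(3)
    brown = np.cumsum(rng.standard_normal(4096))   # integrated white noise
    print(fourier_dimension(brown))                # roughly 2 (Brown noise)

Smoothing the resulting q(t) with a Gaussian low-pass filter, as in Figures 8 and 9, then reveals the underlying trends.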


Fig. 8. Application of OLR using a 1000 element window for analysing a
financial time series composed of FTSE values (close-of-day) from 02-04-1984
to 13-02-2008. The plot shows the time varying Fourier Dimension
q(t) (green) onto which is superimposed a Gaussian low-pass filtered version
of the signal (red) and the FTSE time series after normalisation.

Fig. 9. Application of OLR using a 1000 element window for analysing a
financial time series composed of FTSE values (close-of-day) from 05-04-1994
to 24-12-1997. The plot shows the time varying Fourier Dimension
q(t) (green) onto which is superimposed a Gaussian low-pass filtered version
of the signal (red) and the FTSE time series after normalisation.

XIII. DISCUSSION

This paper is concerned with the introduction and theoretical
analysis (in terms of a general solution) associated with the
non-stationary fractional diffusion operator

∂^2/∂x^2 − σ^{q(t)} ∂^{q(t)}/∂t^{q(t)}

in the context of a macroeconomic model. By considering a
source function of the type δ(x)n(t) where n(t) is white noise,
we have shown that, for x → 0, the fractional diffusive field
u(t) at time τ is given by (ignoring scaling)

u(t, τ) = (1/t^{1−q(τ)/2}) ⊗ n(t)

which has a Power Spectral Density Function characterised by
1/| ω |^{q(τ)} - a random scaling fractal. It should be noted
that the data analysis reported in this paper is based on an
asymptotic solution (i.e. x → 0) used to obtain equation (6)
and is thus limited in the extent to which it reflects the
physical principles upon which the model has been established.
However, it is noted that the computation of q(t) in the
presence of additive white noise is equivalent to the inversion
of equation (7) for q (and for arbitrary values of x_0) when
σ → 0. In this sense, the power spectrum method used to
compute q(t) is valid under the assumption that a fractional
diffusive process occurs with high diffusivity and a high
signal-to-noise ratio (i.e. M_1(x_0) → 0). For the case when
σ << 1, the inversion of equation (8) to compute q from u
might be possible using an iterative approach which can be
extended to solve the general case as required.
The non-stationary nature of this model is taken to account
for stochastic processes that can vary in time and are
intermediate between diffusive and propagative or persistent
behaviour. Application of Orthogonal Linear Regression to
macroeconomic time series data provides an accurate and
robust method to compute q(t) when compared to other statistical
estimation techniques such as the least squares method.
As a result of the physical interpretation associated with the
fractional diffusion equation and the meaning of q(t), we
can, in principle, use the signal q(t) as a predictive measure in
the sense that as the value of q(t) continues to increase, there
is a greater likelihood for volatile behaviour of the markets.
This is reflected in the data analysis that is compounded in
Figure 8 for the FTSE close-of-day between 1980 and 2007
and in other financial data, the results of which lie beyond the
scope of this paper^9. It should be noted that because financial
time series data is assumed to be self-affine, the computation
of q(t) can be applied over any time scale, and that the FTSE
close-of-day is only one example that has been used in this
paper as an illustrative case study.

In a statistical sense, q(t) is just another measure that may,
or otherwise, be of value to market traders. In comparison
with other statistical measures, this can only be assessed
through its practical application in a live trading environment.
However, in terms of its relationship to a stochastic model
for macroeconomic data, q(t) does provide a measure that

^9 Similar results being observed for other major stock markets.


is consistent with the physical principles associated with a
random walk that includes a directional bias, i.e. fractional
Brownian motion. The model considered, and the signal
processing algorithm proposed, has a close association with
re-scaled range analysis for computing the Hurst exponent H
since, for D_T = 1, q = H + 1/2 (see Appendix I) [48]. In
this sense, the principal contribution of this paper has been to
consider a model that is quantified in terms of a physically
significant (but phenomenological) model that is compounded
in a specific (fractional) partial differential equation. As with
other financial time series, their derivatives, transforms etc., a
range of statistical measures can be used to characterise q(t),
an example being given in Figure 8 and Figure 9 where q(t)
has been smoothed to provide a measure of the underlying
trends.

In terms of the non-stationary fractional diffusive model
considered in this work, the time varying Fourier dimension
q(t) can be interpreted in terms of a gauge on the characteristics
of a dynamical system. This includes the management
processes from which all modern economies may be
assumed to be derived. In this sense, the FMH is based on
three principal considerations: (i) the non-stationary behaviour
associated with any system undergoing continuous change that
is driven by a management infrastructure; (ii) the cause and
effect that is inherent at all scales (i.e. all levels of management
hierarchy); (iii) the self-affine nature of outcomes relating to
points (i) and (ii). In a modern economy, the principal issue
associated with any form of financial management is based on
the flow of information and the assessment of this information
at different points connecting a large network. In this sense,
a macroeconomy can be assessed in terms of its information
network which consists of a distribution of nodes from which
information can flow in and out. The efficiency of the system
is determined by the level of randomness associated with the
direction of flow of information to and from each node. The
nodes of the system are taken to be individuals or small
groups of individuals whose assessment of the information
they acquire, together with their remit, responsibilities and
initiative, determines the direction of the information flow
from one node to the next. The determination of the efficiency
of a system in terms of randomness is the most critical in terms
of the model developed. It suggests that the performance of
a business is related to how well information flows through
an organisation. If the information flow is entirely random,
then we might surmise that the decisions made which drive
the direction of the system are also entirely random. The
principal point here is that the flow of information has a direct
relationship on the management decisions that are made on
behalf of an organisation.
The non-stationary but statistically self-affine nature of the
markets leads directly to the use of the Fourier dimension as
a measure for quantifying their state of coherence. Just as
this parameter can be used as a market index for managing a
financial portfolio, so it may be of value in quantifying the
state of any organisation undergoing change (management).
The conceptual basis associated with the Fourier dimension
and the system behaviour that it reflects leads directly to an
approach to management where the principles of openness and
Fractal type      Fractal Dimension
Fractal Dust      0 < D_F < 1
Fractal Curve     1 < D_F < 2
Fractal Surface   2 < D_F < 3
Fractal Volume    3 < D_F < 4
Fractal Time      4 < D_F < 5
Hyper-fractals    5 < D_F < 6
...               ...

TABLE II
FRACTAL TYPES AND CORRESPONDING FRACTAL DIMENSIONS

transparency articulate the degree of coherence of information
flow through an organisation from one level to another. In
effect, the sustained organisational approach to managing
continuous change is the basis for a portfolio in which q(t) > 1
and increases with time.

The FMH and the self-affine nature of organisations in
general provides a model in which the work-force at any one
level (i.e. department/section/group etc.) of an organisation
can empathise with all other levels by cultivating an understanding
in which each level is a reflection of their own, e.g.
problems/solutions at middle management are a reflection of
the same type of problems/solutions at executive level. This
empathy is a two-way entity which differs only in terms
of its scale. Sustained organisational change and the example
methods of implementing it is a self-affine process and should
thus be introduced with this aspect in mind [50]. In tackling
problems at any level within an organisation, one is, in effect,
taking consideration of such problems above and below that
same level in terms of the dynamic behaviour of the system
as a whole, a macroeconomy being the antithesis of such a
system.
APPENDIX I
RELATIONSHIP BETWEEN THE HURST EXPONENT AND THE
TOPOLOGICAL, FRACTAL AND FOURIER DIMENSIONS

Suppose we cut up some simple one-, two- and three-dimensional
Euclidean objects (a line, a square surface and
a cube, for example), make exact copies of them and then
keep on repeating the copying process. Let N be the number
of copies that we make at each stage and let r be the length
of each of the copies, i.e. the scaling ratio. Then we have

N r^{D_T} = 1, D_T = 1, 2, 3, ...

where D_T is the topological dimension. The similarity or
fractal dimension is that value of D_F which is usually (but not
always) a non-integer dimension greater than its topological
dimension (i.e. 0, 1, 2, 3, ... where 0 is the dimension of a point
on a line) and is given by

D_F = log(N) / log(1/r).

The fractal dimension is that value that is strictly greater
than the topological dimension as given in Table II. In each
case, as the value of the fractal dimension increases, the fractal
becomes increasingly space-filling in terms of the topological
dimension which the fractal dimension is approaching. In each


case, the fractal exhibits structures that are self-similar. A
self-similar deterministic fractal is one where a change in the
scale of a function f(x) (which may be a multi-dimensional
function) by a scaling factor λ produces a smaller version,
reduced in size by λ, i.e.

f(λx) = λf(x).

A self-affine deterministic fractal is one where a change in
the scale of a function f(x) by a factor λ produces a smaller
version reduced in size by a factor λ^q, q > 0, i.e.

f(λx) = λ^q f(x).

For stochastic fields, the expression

Pr[f(λx)] = λ^q Pr[f(x)]

describes a statistically self-affine field - a random scaling
fractal. As we zoom into the fractal, the shape changes, but
the distribution of lengths remains the same.
There is no unique method for computing the fractal dimension. The methods available are broadly categorized into
two families: (i) Size-measure relationships, based on recursive
length or area measurements of a curve or surface using
different measuring scales; (ii) application of relationships
based on approximating or tting a curve or surface to a known
fractal function or statistical property, such as the variance.
Consider a simple Euclidean straight line of length L( )
over which we walk a shorter ruler of length . The number
of steps taken to cover the line N [L( ), ] is then L/ which
is not always an integer for arbitrary L and . Since
N [L( ), ] =

L( )
= L( ) 1 ,

ln L( ) ln N [L( ), ]
=
1=
ln

ln N [L( ), ] ln L( )
ln

which expresses the topological dimension DT = 1 of the


line. In this case, L( ) is the Lebesgue measure of the line
and if we normalize by setting L( ) = 1, the latter equation
can then be written as
1 = lim

ln N ()
ln

since there is less error in counting N () as becomes smaller.


We also then have N(δ) = δ^(-1). For extension to a fractal curve f, the essential point is that the fractal dimension should satisfy an equation of the form

N[F(f), δ] = F(f) δ^(-D_F)

where N[F(f), δ] is read as the number of rulers of size δ needed to cover a fractal set f whose measure is F(f), which can be any valid suitable measure of the curve. Again we may normalize, which amounts to defining a new measure F' as some constant multiplied by the old measure, to get

D_F = lim_(δ→0) ln N(δ) / ln(1/δ)
where N(δ) is taken to be N[F(f), δ] for notational convenience. Thus, a piecewise continuous field has precise fractal properties over all scales. However, for the discrete (sampled) field,

D = ⟨ ln N(δ) / ln(1/δ) ⟩

where we choose values δ1 and δ2 (i.e. the upper and lower bounds) satisfying δ1 < δ < δ2 over which we apply an averaging process denoted by ⟨·⟩. The most common approach is to utilise a bi-logarithmic plot of ln N(δ) against ln δ, choose values δ1 and δ2 over which the plot is uniform and apply an appropriate data-fitting algorithm (e.g. a least squares estimation method or, as used in this paper, Orthogonal Linear Regression) within these limits.
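To make the bi-logarithmic procedure concrete, the following Python sketch estimates D by box counting (an illustration under assumed scale choices; an ordinary least squares fit stands in for the Orthogonal Linear Regression used in the paper):

```python
import numpy as np

# Box-counting estimate of the fractal dimension D: count the grid cells
# N(delta) occupied by the graph of f at scales delta = 1/n, then fit the
# slope of ln N(delta) against ln(1/delta) = ln n.
def fractal_dimension(f, scales=(4, 8, 16, 32, 64, 128)):
    f = (f - f.min()) / (f.max() - f.min())  # normalize the graph to [0, 1]^2
    t = np.linspace(0.0, 1.0, len(f))
    counts = []
    for n in scales:
        ix = np.minimum((t * n).astype(int), n - 1)
        iy = np.minimum((f * n).astype(int), n - 1)
        counts.append(len(set(zip(ix.tolist(), iy.tolist()))))
    slope, _ = np.polyfit(np.log(scales), np.log(counts), 1)
    return slope

# A Brownian signal (H = 1/2) should give D close to 1.5, since (as
# derived below) D = 2 - H for such a signal
rng = np.random.default_rng(1)
print(fractal_dimension(np.cumsum(rng.standard_normal(200000))))
```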
The relationship between the Fourier dimension q and the fractal dimension D_F can be determined by considering this method for analysing a statistically self-affine field. For a fractional Brownian process (with unit step length),

ΔA(t) = Δt^H, H ∈ (0, 1]

where H is the Hurst exponent. Consider a fractal curve covering a unit time period which is divided up into N equal intervals, each of duration Δt = 1/N. The amplitude increments ΔA are then given by

ΔA = Δt^H = 1/N^H = N^(-H).

The number of lengths ε = N^(-1) required to cover each interval is

ΔA/Δt = N^(-H)/N^(-1) = N^(1-H)

so that

N(ε) = N · N^(1-H) = N^(2-H).

Now, since

N(ε) = 1/ε^(D_F), ε → 0,

then, by inspection,

D_F = 2 - H.
Thus, a Brownian process, where H = 1/2, has a fractal dimension of 1.5. For higher topological dimensions D_T,

D_F = D_T + 1 - H.

This algebraic equation provides the relationship between the fractal dimension D_F, the topological dimension D_T and the Hurst exponent H. We can now determine the relationship between the Fourier dimension q and the fractal dimension D_F.
Consider a fractal signal f(x) over an infinite support with a finite sample f_X(x), given by

f_X(x) = { f(x), 0 < x < X;
           0, otherwise.

A finite sample is essential as otherwise the power spectrum diverges. Moreover, if f(x) is a random function then, for any experiment or computer simulation, we must necessarily take a finite sample. Let F_X(k) be the Fourier transform of f_X(x), P_X(k) be the power spectrum and P(k) be the power spectrum of f(x). Then


f_X(x) = (1/2π) ∫ F_X(k) exp(ikx) dk,

P_X(k) = (1/X) |F_X(k)|²

and

P(k) = lim_(X→∞) P_X(k).

The power spectrum gives an expression for the power of a signal for particular harmonics; P(k)dk gives the power in the range k to k + dk. Consider a function g(x), obtained from f(x) by scaling the x-coordinate by some a > 0, the f-coordinate by 1/a^H, and then taking a finite sample as before, i.e.

g_X(x) = { (1/a^H) f(ax), 0 < x < X;
           0, otherwise.

Let G_X(k) and P'_X(k) be the Fourier transform and power spectrum of g_X(x), respectively. We then obtain an expression for G_X in terms of F_X,

G_X(k) = ∫_0^X g_X(x) exp(-ikx) dx = (1/a^(H+1)) ∫_0^(aX) f(s) exp(-iks/a) ds

where s = ax. Hence

G_X(k) = (1/a^(H+1)) F_X(k/a)

and the power spectrum of g_X(x) is

P'_X(k) = (1/a^(2H+1)) (1/(aX)) |F_X(k/a)|²

and, as X → ∞,

P'(k) = (1/a^(2H+1)) P(k/a).

Since g(x) is a scaled version of f(x), their power spectra are equal, and so

P'(k) = P(k) = (1/a^(2H+1)) P(k/a).

If we now set k = 1 and then replace 1/a by k we get

P(k) ∝ 1/k^(2H+1) ≡ 1/k^β.

Now, since β = 2H + 1 and D_F = 2 - H, we have

D_F = 2 - (β - 1)/2 = 5/2 - β/2.

The fractal dimension of a fractal signal can be calculated directly from β using the above relationship. This method also generalizes to higher topological dimensions, giving

β = 2H + D_T.

Thus, since D_F = D_T + 1 - H, β = 5 - 2D_F for a fractal signal and β = 8 - 2D_F for a fractal surface, so that, in general,

β = 2(D_T + 1 - D_F) + D_T = 3D_T + 2 - 2D_F

and

D_F = D_T + 1 - H = D_T + 1 - (β - D_T)/2 = (3D_T + 2)/2 - β/2,

the Fourier dimension being given by q = β/2.
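As an illustration of the spectral route just derived (a hedged sketch: the raw periodogram and an unweighted least squares fit below are simple stand-ins for more careful estimators), β, H and D_F can be estimated directly from the power-spectrum slope of a signal:

```python
import numpy as np

# Estimate beta from P(k) ~ 1/k**beta, then H = (beta - 1)/2 and D_F = 2 - H.
def spectral_exponents(f):
    F = np.fft.rfft(f - f.mean())
    P = np.abs(F) ** 2 / len(f)     # periodogram of the finite sample
    k = np.arange(1, len(P))        # harmonics, skipping the DC component
    slope, _ = np.polyfit(np.log(k), np.log(P[1:]), 1)
    beta = -slope
    H = (beta - 1.0) / 2.0
    return beta, H, 2.0 - H

# Brownian motion should give beta ~ 2, H ~ 0.5 and D_F ~ 1.5
rng = np.random.default_rng(2)
print(spectral_exponents(np.cumsum(rng.standard_normal(1 << 16))))
```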

ACKNOWLEDGMENT

Some of the material presented in this paper is based on the PhD Theses of two former research students of the author, Dr Mark London and Dr Irena Lvova. The author is grateful to Mr Bruce Murray (Lexicon Data Limited) for his help in the assessment of the financial data analysis carried out by the author and to Ms Mariam Fardoost (Merrill Lynch, London) for her advice with regard to global economics, macro overviews and market forecasts.

REFERENCES

[1] http://www.tickdata.com/
[2] http://www.vhayu.com/
[3] http://en.wikipedia.org/wiki/Louis_Bachelier
[4] http://en.wikipedia.org/wiki/Robert_Brown_(botanist)
[5] T. R. Copeland, J. F. Weston and K. Shastri, Financial Theory and Corporate Policy, 4th Edition, Pearson Addison Wesley, 2003.
[6] J. D. Martin, S. H. Cox, R. F. McMinn and R. D. Maminn, The Theory of Finance: Evidence and Applications, International Thomson Publishing, 1997.
[7] R. C. Merton, Continuous-Time Finance, Blackwell Publishers, 1992.
[8] T. J. Watsham and K. Parramore, Quantitative Methods in Finance, Thomson Business Press, 1996.
[9] E. Fama, The Behavior of Stock Market Prices, Journal of Business, Vol. 38, 34-105, 1965.
[10] P. Samuelson, Proof That Properly Anticipated Prices Fluctuate Randomly, Industrial Management Review, Vol. 6, 41-49, 1965.
[11] E. Fama, Efficient Capital Markets: A Review of Theory and Empirical Work, Journal of Finance, Vol. 25, 383-417, 1970.
[12] G. M. Burton, Efficient Market Hypothesis, The New Palgrave: A Dictionary of Economics, Vol. 2, 120-123, 1987.
[13] F. Black and M. Scholes, The Pricing of Options and Corporate Liabilities, Journal of Political Economy, Vol. 81(3), 637-659, 1973.
[14] http://uk.finance.yahoo.com/q/hp?s=%5EFTSE
[15] B. B. Mandelbrot and J. R. Wallis, Robustness of the Rescaled Range R/S in the Measurement of Noncyclic Long Run Statistical Dependence, Water Resources Research, Vol. 5(5), 967-988, 1969.
[16] B. B. Mandelbrot, Statistical Methodology for Non-periodic Cycles: From the Covariance to R/S Analysis, Annals of Economic and Social Measurement, Vol. 1(3), 259-290, 1972.
[17] E. H. Hurst, A Short Account of the Nile Basin, Cairo, Government Press, 1944.
[18] http://en.wikipedia.org/wiki/Elliott_wave_principle
[19] http://www.olivercromwell.org/jews.htm
[20] B. B. Mandelbrot, The Fractal Geometry of Nature, Freeman, 1983.
[21] J. Feder, Fractals, Plenum Press, 1988.
[22] K. J. Falconer, Fractal Geometry, Wiley, 1990.
[23] H. O. Peitgen, H. Jürgens and D. Saupe, Chaos and Fractals: New Frontiers of Science, Springer, 1992.
[24] P. Bak, How Nature Works, Oxford University Press, 1997.
[25] M. J. Turner, J. M. Blackledge and P. Andrews, Fractal Geometry in Digital Imaging, Academic Press, 1997.
[26] N. Lam and L. De Cola, Fractals in Geography, Prentice-Hall, 1993.
[27] J. M. Blackledge, Digital Image Processing, Horwood, 2006.


[28] H. O. Peitgen and D. Saupe (Eds.), The Science of Fractal Images, Springer, 1988.
[29] A. J. Lichtenberg and M. A. Lieberman, Regular and Stochastic Motion: Applied Mathematical Sciences, Springer-Verlag, 1983.
[30] J. J. Murphy, Intermarket Technical Analysis: Trading Strategies for the Global Stock, Bond, Commodity and Currency Market, Wiley Finance Editions, Wiley, 1991.
[31] J. J. Murphy, Technical Analysis of the Futures Markets: A Comprehensive Guide to Trading Methods and Applications, New York Institute of Finance, Prentice-Hall, 1999.
[32] T. R. DeMark, The New Science of Technical Analysis, Wiley, 1994.
[33] J. O. Matthews, K. I. Hopcraft, E. Jakeman and G. B. Siviour, Accuracy Analysis of Measurements on a Stable Power-law Distributed Series of Events, J. Phys. A: Math. Gen., Vol. 39, 13967-13982, 2006.
[34] W. H. Lee, K. I. Hopcraft and E. Jakeman, Continuous and Discrete Stable Processes, Phys. Rev. E, Vol. 77, American Physical Society, 011109, 1-4.
[35] A. Einstein, On the Motion of Small Particles Suspended in Liquids at Rest Required by the Molecular-Kinetic Theory of Heat, Annalen der Physik, Vol. 17, 549-560, 1905.
[36] J. M. Blackledge, G. A. Evans and P. Yardley, Analytical Solutions to Partial Differential Equations, Springer, 1999.
[37] H. Hurst, Long-term Storage Capacity of Reservoirs, Transactions of the American Society of Civil Engineers, Vol. 116, 770-808, 1951.
[38] M. F. Shlesinger, G. M. Zaslavsky and U. Frisch (Eds.), Lévy Flights and Related Topics in Physics, Springer, 1994.
[39] R. Hilfer, Foundations of Fractional Dynamics, Fractals, Vol. 3(3), 549-556, 1995.
[40] A. Compte, Stochastic Foundations of Fractional Dynamics, Phys. Rev. E, Vol. 53(4), 4191-4193, 1996.
[41] T. F. Nonnenmacher, Fractional Integral and Differential Equations for a Class of Lévy-type Probability Densities, J. Phys. A: Math. Gen., Vol. 23, L697S-L700S, 1990.
[42] R. Hilfer, Exact Solutions for a Class of Fractal Time Random Walks, Fractals, Vol. 3(1), 211-216, 1995.
[43] R. Hilfer and L. Anton, Fractional Master Equations and Fractal Time Random Walks, Phys. Rev. E, Vol. 51(2), R848-R851, 1995.
[44] P. M. Morse and H. Feshbach, Methods of Theoretical Physics, McGraw-Hill, 1953.
[45] G. F. Roach, Green's Functions (Introductory Theory with Applications), Van Nostrand Reinhold, 1970.
[46] F. B. Tatom, The Application of Fractional Calculus to the Simulation of Stochastic Processes, Engineering Analysis Inc., Huntsville, Alabama, AIAA-89/0792, 1989.
[47] S. Mallat, A Wavelet Tour of Signal Processing, Academic Press, ISBN: 0-12-466606-X, 1999.
[48] I. Lvova, Application of Statistical Fractional Methods for the Analysis of Time Series of Currency Exchange Rates, PhD Thesis, De Montfort University, 2006.
[49] http://www.mathworks.com/matlabcentral/fileexchange/loadFile.do?objectId=6716&objectType=File
[50] C. Davies, Sustained Organisational Change: A Hearts and Minds Approach, PhD Thesis, Loughborough University, 2007.

Jonathan Blackledge received a BSc in Physics from Imperial College, London University in 1980, a Diploma of Imperial College in Plasma Physics in 1981 and a PhD in Theoretical Physics from King's College, London University in 1983. As a Research Fellow of Physics at King's College (London University) from 1984 to 1988, he specialized in information systems engineering, undertaking work primarily for the defence industry. This was followed by academic appointments at the Universities of Cranfield (Senior Lecturer in Applied Mathematics) and De Montfort (Professor in Applied Mathematics and Computing), where he established new post-graduate MSc/PhD programmes and research groups in computer aided engineering and informatics. In 1994, he co-founded Management and Personnel Services Limited, where he is currently Executive Director. His work for Microsharp (Director of R & D, 1998-2002) included the development of manufacturing processes now being used for digital information display units. In 2002, he co-founded a group of companies specializing in information security and cryptology for the defence and intelligence communities, actively creating partnerships between industry and academia. He currently holds academic posts in the United Kingdom and South Africa, and in 2007 was awarded Fellowships of the City and Guilds London Institute and the Institute of Leadership and Management, together with Freedom of the City of London, for his role in the development of the Higher Level Qualification programmes in Engineering, ICT and Business Administration, most recently for the nuclear industry, security and financial sectors respectively. Professor Blackledge has published over one hundred scientific and engineering research papers and technical reports for industry, six industrial software systems, fifteen patents and ten books, and has been supervisor to sixty research (PhD) graduates. He lectures widely to a variety of audiences composed of mathematicians, computer scientists, engineers and technologists in areas that include cryptology, communications technology and the use of artificial intelligence in process engineering, financial analysis and risk management. His current research interests include computational geometry and computer graphics, image analysis, nonlinear dynamical systems modelling and computer network security, working in both an academic and commercial context. He holds Fellowships with England's leading scientific and engineering Institutes and Societies, including the Institute of Physics, the Institute of Mathematics and its Applications, the Institution of Electrical Engineers, the Institution of Mechanical Engineers, the British Computer Society, the Royal Statistical Society and the Institute of Directors. He is a Chartered Physicist, Chartered Mathematician, Chartered Electrical Engineer, Chartered Mechanical Engineer, Chartered Statistician and a Chartered Information Technology Professional.

Regular Paper
Original Contribution

An Ultra-Wideband Six-port I/Q Demodulator Covering from 3.1 to 4.8 GHz

Pär Håkansson, Duxiang Wang, Shaofang Gong

Abstract: This paper presents an ultra-wideband I/Q demodulator based on the six-port technique. The six-port I/Q demodulator covers the frequency spectrum from 3.1 to 4.8 GHz, i.e., it covers the lower band of the UWB spectrum. The demodulator thus has a relative bandwidth of 43%. The six-port circuit utilizes three ultra-wideband 3-dB 90° branch couplers and one 3-dB 0° Wilkinson power divider. It is manufactured utilizing microstrips on a printed circuit board. Simulation and measurement results of this new six-port and the traditional six-port correlators are compared. The designed six-port correlator shows good phase and amplitude balances. The I/Q demodulator with this new correlator also shows good demodulation results in the frequency range without any calibration.

Index Terms: directional coupler, broadband correlator, quadrature hybrid, receivers, six-port receiver, UWB systems

I. INTRODUCTION

Modern communication systems require high data rate, wide bandwidth, small size and low cost. A trend in communications is towards reconfigurable radio terminals, i.e., software defined radio (SDR). SDR requires receivers with wideband capabilities to support as many different services as possible in a very wide band of frequencies. The homodyne topology offers advantages in reducing the complexity and cost of a radio receiver. Therefore, it is of particular interest and much research effort has been devoted to homodyne receivers. However, conventional homodyne receivers are traditionally narrowband. To overcome this problem, the six-port receiver, with a wideband property, is a promising architecture [1]. The six-port as a communication receiver was first introduced in 1994 by Ji Li, R. G. Bosisio and Ke Wu [2]. The six-port receiver is a wideband solution but still with limited bandwidth. Today, ultra-wideband (UWB) operation (>20% relative bandwidth) is needed to achieve high speed for short-range wireless communication. There are two dominating UWB solutions for high data rate and short range wireless communication. One is based on the direct sequence spread spectrum technique and the other is based on the multi-band

Manuscript received 2007-10-18. Vinnova, a Swedish funding organization, is acknowledged for financial support of this study.
Pär Håkansson and Shaofang Gong are with Linköping University, Department of Science and Technology, SE-60174 Norrköping, Sweden (phone: +46-11-363368, fax: +46-11-363270, e-mail: parha@itn.liu.se).
Duxiang Wang is with the Electronic Equipment Institute, P.O. Box 1610, Nanjing, 210007 Jiangsu, P.R. China.

orthogonal frequency division multiplexing technique [3]. The multi-band specification divides the frequency spectrum into 500 MHz sub-bands. Three sub-bands are mandatory, centered at 3.432, 3.960, and 4.488 GHz, respectively [3]. Here, we present a new ultra-wideband solution of an I/Q demodulator based on the six-port principle covering the frequency range 3.1 to 4.8 GHz.

Another advantage of the six-port receiver is the receiver sensitivity, which is higher compared to a standard homodyne receiver [4]. This makes it a good candidate for tomorrow's high frequency and high data rate receivers as well as SDR receivers. The six-port I/Q demodulator presented in this paper utilizes an ultra-wideband correlator implemented on a standard printed circuit board. The designed ultra-wideband correlator shows good phase and amplitude balances. This makes it possible to use the I/Q demodulator over a wide frequency band without utilizing any calibration technique. Previous publications have shown a 23-31 GHz receiver implemented with a microwave monolithic integrated circuit (MMIC), i.e., a relative bandwidth of 30%, without utilizing any calibration [5-7]. There are several previous publications showing wide operating bandwidths utilizing calibration methods [8-9]. In general, calibration methods can be utilized to increase the bandwidth of six-port receivers [8]. However, if a six-port receiver does not use any calibration procedure the data rate can be increased, since this reduces the signal processing requirement [7]. This paper presents a six-port I/Q demodulator without any calibration but with good demodulation results in the UWB spectrum 3.1 to 4.8 GHz, i.e., a relative bandwidth of 43%, which has never been reported before.
II. DESIGN OF I/Q DEMODULATOR

A. Block diagram of the I/Q Demodulator

Fig. 1 shows the block diagram of a six-port homodyne receiver. It consists of a low noise amplifier (LNA), a six-port correlator, four radio frequency diodes, four low-pass filters and a judgment circuit. Fig. 2 shows two different kinds of judgment circuits, the analog and the digital judgment circuit, respectively. The analog judgment circuit can be implemented using an instrumentation amplifier [7]. If the received signal is modulated using quadrature phase shift keying (QPSK), a simple comparator can be used as the demodulator [4]. The benefit of using a digital solution is that calibration and compensation techniques can be utilized to compensate for hardware phase and amplitude errors, thus increasing the bandwidth of the receiver.

[Fig. 1. Block diagram of a six-port receiver with the I/Q demodulator in the dashed-line block: RFin, LNA, six-port correlator, diodes and LP filters (outputs w3-w6), judgment circuit, I-data and Q-data, LO.]

[Fig. 2a and 2b. Judgment circuits, (a) analog and (b) digital, for six-port receivers.]

The principle of the six-port circuit as a communication receiver is well explained by Hentschel in [10]. If the input signal is assumed to be

s_rf = x_I(t) cos(ω_rf t) + x_Q(t) sin(ω_rf t) = x_BB(t) cos(ω_rf t + φ_rf(t))    (1)

where x_BB is the amplitude, ω_rf is the angular frequency and φ_rf is the phase of the RF input signal, the local oscillator (LO) signal is assumed to be

s_lo = A_lo cos(ω_lo t + φ_lo)    (2)

where A_lo is the amplitude and φ_lo the phase of the LO signal. Assume ω_rf = ω_lo and φ_lo = 0, i.e., the so-called coherent reception, an ideal six-port correlator, i.e., the output voltages expressed in Fig. 3, diodes operating in the square-law region, and ideal low-pass filters. Then the following I- and Q-output voltages are produced at the output ports w3 to w6 utilizing the analog judgment circuit in Fig. 2a:

w3 - w4 = 0.5 K1 K2 A_lo x_I(t)    (3)

w5 - w6 = 0.5 K1 K2 A_lo x_Q(t)    (4)

where K1 is the transfer function of the diodes and K2 is the transfer function of the low-pass filter. Hence, as seen in Eqs. 3 and 4, the I- and Q-data are produced. However, this requires a good amplitude balance and phase balance between the output ports, i.e., w3 to w6 in Fig. 1. To achieve the w3-w4 and w5-w6 operations shown in Fig. 2a, two instrumentation amplifiers are utilized, forming the analog judgment circuit. The schematic of the instrumentation amplifier utilized in the designed judgment circuit is shown in Fig. 4. The purpose of the instrumentation amplifier is to amplify the difference between the input ports, i.e., w3-w4 and w5-w6, and to have a high common-mode rejection ratio.

[Fig. 3. Block diagram of the ideal six-port correlator: a power divider and 90° couplers combine s_lo and s_rf into the ideal output voltages at Ports 3-6, e.g., j0.5 s_lo + 0.5 s_rf and 0.5 s_lo - 0.5 s_rf.]

[Fig. 4. The simplified schematic of the instrumentation amplifier used as an analog judgment circuit.]
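A baseband-equivalent numerical sketch of Eqs. 1-4 may help fix ideas. This is an illustration under idealized assumptions (an ideal correlator, perfect square-law diodes and K1 = K2 = 1), not a model of the implemented hardware; the four detected port powers are differenced pairwise to recover the I- and Q-data:

```python
import numpy as np

# Baseband model of an ideal six-port correlator followed by square-law
# diodes and the analog (difference) judgment circuit of Fig. 2a.
# Port scalings and phase offsets are idealized assumptions, K1 = K2 = 1.
def sixport_iq(s_rf, A_lo=1.0):
    s_lo = A_lo                           # coherent reception, phi_lo = 0
    offsets = np.exp(1j * np.array([0.0, np.pi, -np.pi / 2, np.pi / 2]))
    # w3..w6: power detected at each correlator output port
    w = [np.abs(0.5 * s_lo + 0.5 * p * s_rf) ** 2 for p in offsets]
    return w[0] - w[1], w[2] - w[3]       # proportional to xI(t) and xQ(t)

# QPSK test: xI, xQ in {-1, +1}; hard decisions recover the symbols
rng = np.random.default_rng(0)
xi, xq = rng.integers(0, 2, (2, 1000)) * 2 - 1
i_out, q_out = sixport_iq(0.1 * (xi + 1j * xq))
assert np.all(np.sign(i_out) == xi) and np.all(np.sign(q_out) == xq)
```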

B. Prototype of the ultra-wideband I/Q demodulator

The ultra-wideband six-port I/Q demodulator is implemented utilizing microstrips on a two-layer printed circuit board. The parameters of the substrate are listed in Table I. The six-port correlator consists of three 90° branch couplers and a Wilkinson power divider. A Wilkinson power divider can reach 40% relative bandwidth, which meets the specification of the designed correlator. However, the conventional branch coupler has only a relative bandwidth of 10% [11]. This paper utilizes our modified microstrip branch coupler [12], with wideband matching networks, in order to maintain low insertion loss, small amplitude imbalance and small phase imbalance within the operating frequency range of 3.1 to 4.8 GHz. Detailed design information can be found in [12].

Fig. 5 shows a photo of the designed I/Q demodulator. Note that in Fig. 5 the judgment circuit is not on the board; it is connected to the output ports, i.e., P1 to P4. The amplifiers used in the instrumentation amplifier are OPA3691 from Texas Instruments Inc. The instrumentation amplifier has a gain of approximately 10.8 dB and a -3 dB bandwidth of 110 MHz. The diodes used are zero-biased Schottky diodes BAT 15-07LRF from Infineon Technologies Inc., operating in the square-law region. The low-pass filter is designed to have a bandwidth of 500 MHz.
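For orientation, the textbook starting values for these elements can be computed as below. This is a sketch only: the effective permittivity is an assumed estimate for this substrate, and the actual correlator uses the wideband matching networks of [12] rather than these nominal single-section values:

```python
import math

# Nominal single-section values: a 3-dB branch-line coupler uses series
# arms of Z0/sqrt(2) and shunt arms of Z0; an equal-split Wilkinson
# divider uses sqrt(2)*Z0 quarter-wave arms and a 2*Z0 isolation resistor.
c, f0, Z0 = 3e8, 3.96e9, 50.0   # speed of light, lower-band centre, system Z0
eps_eff = 2.7                   # assumed microstrip effective permittivity
quarter_wave_mm = c / (f0 * math.sqrt(eps_eff)) / 4 * 1e3
print("branch-line series arm %.1f ohm, shunt arm %.1f ohm"
      % (Z0 / math.sqrt(2), Z0))
print("Wilkinson arm %.1f ohm, isolation resistor %.0f ohm"
      % (math.sqrt(2) * Z0, 2 * Z0))
print("quarter-wave length %.1f mm" % quarter_wave_mm)   # about 11.5 mm
```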


[Fig. 5. Photo of the designed six-port I/Q demodulator, showing the Wilkinson power divider, the wideband 90° branch couplers, the LP filters and diodes, the LO and RF inputs and the output ports P1-P4.]

TABLE I. SUBSTRATE PROPERTIES
Dielectric thickness          0.254 mm
Relative dielectric constant  3.48
Loss factor                   0.004
Metal thickness               25 µm
Metal conductivity            58 MS/m
Surface roughness             1 µm

C. Test set-up

The test set-up for the proposed six-port I/Q demodulator is shown in Fig. 6. The vector signal generator used in the test set-up is an SMIQ 06B from Rohde & Schwarz. To generate the LO signal, a continuous sinusoidal wave from a ZVM from Agilent Technologies Inc. is utilized. To analyze the demodulated signals, a wideband oscilloscope and a spectrum analyzer from Agilent Technologies are used. The vector signal analyzer software package VSA 89600 from Agilent Technologies Inc. is used to further analyze the I/Q vector signals.

[Fig. 6. The set-up used for measurement of the I/Q demodulator: a vector signal generator and the LO source share a reference clock, and the I- and Q-outputs of the instrumentation amplifiers are analyzed with an oscilloscope and a vector signal analyzer.]

III. EXPERIMENTAL RESULTS

A. The ultra-wideband correlator

The key component in the proposed wideband I/Q demodulator is the ultra-wideband correlator. Fig. 7 shows the layout of the designed six-port correlator, where P1 is the LO input port, P6 is the RF input port and P2-P5 are the output ports. Port P7 is terminated with a 50 Ω load. All the design and optimization of the correlator are done using Momentum in Advanced Design System from Agilent Technologies Inc. Figs. 8 and 9 show the scattering parameters (s-parameters) from the LO input and the RF input to the outputs of the six-port, respectively. Fig. 10 shows the measured s-parameter between the RF port and the LO port, i.e., S61 from P1 to P6. It is seen that the LO-RF leakage is lower than -22.5 dB within the frequency range 3.1-4.8 GHz. The theoretical amplitude and phase differences from the RF and LO inputs to the adjacent output ports should be 0 dB and 90°, respectively, as shown in Fig. 3. Fig. 11 shows the phase differences from the LO and RF inputs to adjacent output ports, i.e., from P1 and P6 to P2-P3 and P4-P5. Fig. 12 shows the amplitude differences from P1 and P6 to P2-P3 and P4-P5. Table II summarizes the key parameters of the correlator from 3.1 to 4.8 GHz.

[Fig. 7. Layout of the designed ultra-wideband six-port correlator (about 21 mm x 12 mm), showing the three 90° branch couplers with matching networks, the power divider and ports P1-P7.]

[Fig. 8. Measured S-parameters (S21, S31, S41, S51) from the LO input (P1) to the output ports (P2-P5) over 3.1-4.8 GHz.]


[Fig. 9. Measured S-parameters (S26, S36, S46, S56) from the RF input port (P6) to the output ports (P2-P5) over 3.1-4.8 GHz.]

[Fig. 10. Measured leakage (S61) from the RF input port to the LO input port.]

[Fig. 11. Measured phase differences between the RF and LO inputs and adjacent output ports: phase(S21)-phase(S31), phase(S41)-phase(S51), phase(S26)-phase(S36) and phase(S46)-phase(S56), all close to 90°.]

[Fig. 12. Measured amplitude differences between the RF and LO inputs and adjacent output ports: S21-S31, S41-S51, S26-S36 and S46-S56.]

TABLE II. SIMULATED AND MEASURED RESULTS OF THE DESIGNED ULTRA-WIDEBAND SIX-PORT CORRELATOR
                                                     Simulated   Measured
Maximum phase error (P1 to P2-P3 and P4-P5)          < 6.0°      < 7.1°
Maximum phase error (P6 to P2-P3 and P4-P5)          < 7.8°      < 7.0°
Maximum amplitude imbalance (P6 to P2-P3 and P4-P5)  < 1.0 dB    < 1.2 dB
Maximum amplitude imbalance (P1 to P2-P3 and P4-P5)  < 1.1 dB    < 1.4 dB
Maximum loss (P1 to P2-P5)                           < -7.8 dB   < -9.1 dB
Maximum loss (P6 to P2-P5)                           < -9.1 dB   < -10.4 dB

B. The I/Q demodulator

To evaluate the complete ultra-wideband I/Q demodulator shown in Fig. 5, an instrumentation amplifier as shown in Fig. 4 is connected to the output ports of the I/Q demodulator.

The measured I/Q constellation diagram for a 64-QAM signal at the center frequency, i.e., 3.96 GHz, is shown in Fig. 13. The error vector magnitude (EVM), magnitude error, phase error and gain imbalance at the center frequencies of the three sub-channels of the lower band of the Multiband UWB proposal (3.1 to 4.8 GHz) are shown in Table III. The data rate during this measurement is 2 Mbps with an LO signal of 0 dBm and an RF signal of -20 dBm.

Figs. 14 and 15 show the measured and simulated I- and Q-output data with an LO signal at 3.8 GHz and 5 dBm, and a QPSK-modulated RF signal of -10 dBm at 2 Mbps.

Fig. 16a shows the measured eye-diagram of a 30 Mbps demodulated QPSK signal, when the LO signal is 0 dBm and the RF input signal is 15 dBm. It is seen that the eye opening is approximately 700 mV. In Fig. 16b the RF input is decreased to -30 dBm and the eye opening reduces to 10 mV. Accordingly, with a detectable signal level down to 10 mV, a dynamic range of more than 45 dB is measured at the center frequency. The dynamic range in all three bands, i.e., 3.432, 3.960 and 4.488 GHz, is above 40 dB with an LO power of 0 dBm.


[Fig. 13. Measured I and Q constellation diagram of a 64-QAM signal at 3.96 GHz; axes I-output (V) and Q-output (V).]

[Fig. 14. Measured I- and Q-outputs (V) versus time (µs) with a QPSK-modulated signal at a data rate of 2 Mbps.]

[Fig. 15. Simulated I- and Q-outputs (V) versus time (µs) with a QPSK-modulated signal at a data rate of 2 Mbps.]

[Fig. 16. Measured eye-diagrams of the Q-output with a QPSK-modulated signal at 3.96 GHz, with a 0 dBm LO signal and an RF input of (a) 15 dBm (eye opening about 700 mV) and (b) -30 dBm (eye opening about 10 mV).]

TABLE III. MEASURED PROPERTIES OF THE I/Q DEMODULATOR
f_lo = f_rf                  3.432 GHz   3.96 GHz   4.488 GHz
Error vector mag. (%RMS)     5.7         2.2        5.0
Mag. error (%RMS)            4.2         1.3        2.3
Phase error (°)              3.1         1.7        5.6
Gain imbalance (dB)          -0.9        -0.3       -0.5
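The EVM figures in Table III follow the usual definition: the RMS error vector between measured and ideal constellation points, normalized to the RMS magnitude of the ideal constellation. A small hedged sketch (the normalization convention is an assumption; measurement software may differ):

```python
import numpy as np

# EVM (%RMS): RMS distance between measured and ideal constellation
# points, normalized to the RMS magnitude of the ideal constellation.
def evm_percent_rms(measured, ideal):
    return 100.0 * np.sqrt(np.mean(np.abs(measured - ideal) ** 2)
                           / np.mean(np.abs(ideal) ** 2))

# Example: unit-power QPSK plus complex noise of std 0.04 per axis
rng = np.random.default_rng(3)
bits = rng.integers(0, 2, (2, 2000)) * 2 - 1
ideal = (bits[0] + 1j * bits[1]) / np.sqrt(2)
noise = 0.04 * (rng.standard_normal(2000) + 1j * rng.standard_normal(2000))
print(evm_percent_rms(ideal + noise, ideal))   # about 5.7 %RMS
```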

IV. DISCUSSION

It is shown in [12] that a conventional six-port correlator has a relative bandwidth of 10% and a phase imbalance of 25° within the band 3.1-4.8 GHz. To overcome the limited bandwidth of conventional six-port I/Q demodulators, calibration techniques that compensate for amplitude and phase imbalances can be utilized [8]. However, this requires more complex baseband digital processing. The I/Q demodulator presented in this paper does not require these types of calibration procedures. This can increase the data rate of the I/Q demodulator, since it reduces the signal processing requirements [7]. The use of an analog instrumentation amplifier to recover the I/Q outputs avoids the use of four ADCs in the receiver [7]. This paper also presents higher order QAM demodulation results, i.e., 64-QAM, compared to previous publications, e.g., 16-QAM [7]. The receiver in [7] shows the capability to demodulate QPSK signals within a relative bandwidth of 30%. However, those QPSK-modulated signals are demodulated at a much higher EVM than that presented in this paper, i.e., 5.7% in the three lower sub-bands of UWB.

The bandwidth of the I/Q demodulator presented in this paper covers from 3.1 to 4.8 GHz. However, the -3 dB bandwidth of the instrumentation amplifier used in this work is 120 MHz. Thus, the instrumentation amplifier used limits the achievable maximum data rate. The dynamic range of the I/Q demodulator is more than 40 dB in all three sub-bands with an LO signal of 0 dBm. To increase the dynamic range, an LNA for the RF signal with automatic gain control can be utilized.

The size (14 x 6.5 cm) of the manufactured six-port I/Q demodulator is relatively large. Since the low-pass filters utilizing microstrips occupy a large portion of the printed circuit board, an implementation of the low-pass filters utilizing discrete components will reduce the size significantly. Furthermore, a multi-layer printed circuit board design can be used to reduce the size further.

In wireless communications the received signal is distorted


by channel imperfections such as multi-path reflections. Therefore, in coherent reception the carrier must be recovered locally [13]. There have been several studies covering carrier recovery methods for six-port receivers, such as the Costas loop and the reverse-modulation loop [13-15].
V. CONCLUSIONS

An ultra-wideband six-port I/Q demodulator covering the UWB spectrum from 3.1 to 4.8 GHz is presented. The ultra-wideband correlator used is implemented on a standard printed circuit board and shows a relative bandwidth of 43%. The small phase and amplitude imbalances of the wideband correlator, i.e., 7° and 1.4 dB, make it possible to produce a high quality RF signal receiver without using any calibration technique in the UWB frequency band 3.1 to 4.8 GHz. The complete demodulator shows an EVM lower than 5.7% RMS in the three UWB sub-bands between 3.1 and 4.8 GHz.
ACKNOWLEDGMENT

The authors would like to express their gratitude to Magnus Karlsson, Allan Huynh and Adriana Serban at the Department of Science and Technology of Linköping University, Sweden, for their assistance during the course of this work.
REFERENCES

[1] T. Eireiner, T. Schnurr, and T. Müller, "Integration of a six-port receiver for mm-wave communication," IEEE MELECON 2006, Benalmádena (Málaga), Spain, pp. 371-376.
[2] J. Li, R. G. Bosisio and K. Wu, "Computer and Measurement Simulation of a New Digital Receiver Operating Directly at Millimeter-Wave Frequencies," IEEE Trans. Microw. Theory Tech., Vol. 43, No. 12, pp. 2766-2772, December 1995.
[3] D. Geer, "UWB standardization effort ends in controversy," Computer, vol. 39, no. 7, pp. 13-16, July 2006.
[4] J.-C. Schiel, S. O. Tatu, K. Wu and R. G. Bosisio, "Six-port direct digital receiver (SPDR) and standard direct receiver (SDR) results for QPSK modulation at high speeds," in IEEE MTT-S Int. Microwave Symp. Dig., 2002, pp. 931-934.
[5] S. O. Tatu, E. Moldovan, K. Wu and R. G. Bosisio, "A new direct millimeter-wave six-port receiver," IEEE Trans. Microwave Theory Tech., vol. MTT-49, no. 12, pp. 2517-2522, Dec. 2001.
[6] S. O. Tatu, E. Moldovan, G. Brehm, K. Wu and R. G. Bosisio, "Ka-band direct digital receiver," IEEE Trans. Microwave Theory Tech., vol. MTT-50, no. 11, pp. 2436-2442, Nov. 2002.
[7] S. O. Tatu, E. Moldovan, K. Wu, R. G. Bosisio and T. A. Denidni, "Ka-band Analog Front-end for Software-Defined Direct Conversion Receiver," IEEE Trans. Microwave Theory Tech., vol. 53, no. 9, pp. 2768-2776, Sept. 2005.
[8] F. R. de Sousa and B. Huyart, "1.8-5.5 GHz Integrated Five-Port Front-End for Wideband Transceivers," 7th European Conference on Wireless Technology, 2004, pp. 67-69.
[9] T. Mack, A. Honold and J.-F. Luy, "An Extremely Broadband Software Configurable Six-Port Receiver Platform," Proc. of 33rd European Microwave Conference, Munich, 2003, pp. 623-626.
[10] T. Hentschel, "The Six-Port as a Communication Receiver," IEEE Trans. Microw. Theory Tech., vol. 53, no. 3, pp. 1039-1047, March 2005.
[11] G. P. Riblet, "A directional coupler with very flat coupling," IEEE Trans. Microwave Theory Tech., vol. 26, no. 2, pp. 70-74, 1978.
[12] D. Wang, A. Huynh, P. Håkansson, M. Li and S. Gong, "Study of Wideband Microstrip Correlators for Ultra-wideband Communication Systems," Proc. of Asia Pacific Microwave Conf. 2007, Bangkok, accepted for publication.
[13] F. R. de Sousa and B. Huyart, "Reconfigurable Carrier Recovery Loop," Microw. and Optical Tech. Letters, Vol. 43, No. 5, pp. 406-408, Dec. 2004.
[14] E. Marsan, J.-C. Schiel, K. Wu, G. Brehm and R. G. Bosisio, "High-Speed Carrier Recovery Circuit Suitable for Direct Digital QPSK Transceivers," Proc. of RAWCON 2002, pp. 103-106.
[15] F. R. de Sousa and B. Huyart, "Carrier Recovery in Five-Port Receivers," Proc. Euro. Conf. Wireless Technology, 2003, pp. 419-421.
Pär Håkansson was born in Karlshamn, Sweden, in 1979. He received his M.Sc. degree from Linköping University in Sweden in 2003. From 2004 to 2005 he worked as a research engineer in the research group of Communication Electronics at Linköping University, Sweden. In 2005 he started his Ph.D. studies in the same research group. His main work involves both wireless and wired high-speed data communications.

Duxiang Wang was born in Zhenjiang City, Jiangsu, China, in 1965. He received his BEE from Nanjing University of Aeronautics and Astronautics, China, in 1982, and his MSEE from Shanghai University, China, in 1990. After graduating, he joined Nanjing Electronic Equipment Institute, China, where he contributed to microwave circuit, microwave receiver and system design. Since 2001 he has been a professor at Nanjing Electronic Equipment Institute. In 2006 he was a researcher for six months in communication electronics at Linköping University, Sweden, as a senior visiting scholar. Duxiang was the recipient of a special award from the China State Council in 2002 and is presently a senior member of the China Institute of Electronics (CIE).
Shaofang Gong was born in Shanghai, China, in 1960. He received his B.Sc. degree from Fudan University in Shanghai in 1982, and the Licentiate of Engineering and Ph.D. degrees from Linköping University in Sweden, in 1988 and 1990, respectively. Between 1991 and 1999 he was a senior researcher at the microelectronics institute Acreo in Sweden. From 2000 to 2001 he was the CTO at a spin-off company from the institute. Since 2002 he has been full professor in communication electronics at Linköping University, Sweden. His main research interest has been communication electronics including RF design, wireless communications and high-speed data transmissions.
