Abstract
Hume-Rothery's breadth of knowledge combined with a quest for generality gave him insights into the reasons for solubility in metallic systems that have become known as Hume-Rothery's Rules. Presented with solubility details from similar sets of constitutional diagrams, can one expect artificial neural networks (ANN), which are blind to the underlying metals physics, to reveal similar or better
correlations? The aim is to test whether it is feasible to predict solid solubility limits using ANN with the parameters that Hume-Rothery
identified. The results indicate that the correlations expected by Hume-Rothery's Rules work best for a certain range of copper or silver
alloy systems. The ANN can predict a value for solubility, which is a refinement on the original qualitative duties of Hume-Rothery's
Rules. The best combination of input parameters can also be evaluated by ANN.
© 2007 Acta Materialia Inc. Published by Elsevier Ltd. All rights reserved.
Keywords: Hume-Rothery's Rules; Artificial neural networks; Solubility limit of metals; Backpropagation networks; Binary alloys
1. Introduction
Materials science seeks to understand the causative relationships between composition, processing, structure and
properties at a level that allows composition and processing
parameters to be selected to provide targeted properties.
Such relationships can be discerned by experiment and, in
a few instances, by predictive theory. They can sometimes
be obtained by molecular modelling. All molecular modelling techniques can be classified under three general categories: (1) ab initio electronic structure calculations, which are
based upon quantum mechanics; (2) semi-empirical methods, which are also founded upon quantum mechanics,
but which enhance computational speed by using approximations based upon experiment; (3) molecular mechanics,
an empirical method based on classical physics which is
computationally fast [1].
Another approach is to use correlation methods made
possible by artificial neural networks (ANN), which are
finding growing acceptance in many subjects for modelling
1359-6454/$30.00 © 2007 Acta Materialia Inc. Published by Elsevier Ltd. All rights reserved.
doi:10.1016/j.actamat.2007.10.059
(2) The electrochemical factor [22]: or effect of the periods of the solvent and solute atoms was quantified
by Cooke and Hume-Rothery [23] after the electrochemical factor introduced by Pauling [24]. Strongly
electropositive components are more likely to form
compounds with electronegative components than
to form solid solutions, particularly in the later B
sub-groups.
(3) The relative valency factor [22]: Other factors being
equal, a lower valence metal is more likely to dissolve
one of higher valency than vice versa, i.e., the tendency for two metals to form solid solutions is not
necessarily reciprocal. This has been found to be valid
mainly for alloys of copper, silver or gold combined
with metals of higher valency.
Traditionally, phase diagrams were determined by
(often tedious) experimentation, which nevertheless suffered from the difficulty of reaching equilibrium at low
temperatures (typically <0.5Tm) in reasonable timescales
[25]. Hume-Rothery's quest for generality was thus well
founded but has, to an extent, been superseded by computational calculation of phase diagrams following pioneering
work by Kaufman and Bernstein [26] and Hillert [27] and
others in the 1970s giving rise to such tools as CALPHAD,
which has been widely used, e.g., Refs. [28–33].
After the significant work done by Hume-Rothery
et al. on the prediction of solid solubility in alloys, Darken and Gurry [34], Chelikowsky [35], Alonso and Simozar [36], Alonso et al. [37] and Zhang and Liao [38] all
contributed in different ways to the prediction of solid solubility in terms of a soluble/insoluble criterion.
The authors' aim is to stand at the place where Hume-Rothery stood, with the added advantage of the ANN
and, based on a large number of constitutional diagrams
and physical parameters of metals provided by handbooks
[39,40], to simulate the process that Hume-Rothery used to
derive his rules. Hume-Rothery considered 60 systems, and
so one faces the problem of limited data. Nevertheless,
materials scientists often wish to benefit from ANN when the data set
has inherent limitations, and ANN methods have been shown
to provide an efficient tool for experimental data
analysis even when the database is small [41]. Furthermore, this study attempts to predict the solubility quantitatively rather than produce a classification of soluble/insoluble.
Thereafter, an attempt is made to extend the method
to a wider range of silver and copper alloys based on binary
systems and including different structures.
One interesting question is whether the parameters
Hume-Rothery used are sufficient to determine the solubility. Hume-Rothery amended the rules by introducing
the structural parameter 14 years after the first presentation of the rules [42]. In this paper, ANN have been used
to determine the relative importance of the parameters by deliberately omitting one or two of them. Another aim is to explore the
effects of the format of the input parameters. The combi-
3. Data collection
The silver and copper alloy solid solubility limits (at.%)
are recorded from Massalski et al. [39], Moffatt [56] and
ASM Handbook, vol. 3, Alloy Phase Diagrams [57]. The
physical parameters, radii, valences and electrochemical
factors (electronegativity) of solvent and solute atoms are
taken from Stark and Wallace [40] and from Aylward
and Findlay [58]. The valences of elements, which were
mentioned by Hume-Rothery in 1934, follow his representation. Radii of Al, Ga, and α-Fe also followed Hume-Rothery's representations [59]. The structure parameter is
taken from Ref. [57].
The whole data set is used in two ways: (1) the 60-alloy
systems, which were first mentioned by Hume-Rothery in
1934, are used for training the neural network and testing
whether Hume-Rothery's Rules work in this range of
alloy systems; (2) all the 408 silver and copper alloy systems
collected are used to repeat the process.
4. Configuration of the neural network
The ANN are constructed, trained and simulated in
MATLAB, run under the Microsoft Windows
XP operating system on an IBM ThinkPad T40
(Model 2373-12H: Intel Pentium M 1.3 GHz, 256 MB, 30
GB, 14.1 in. TFT), with Java VM Version: Java 1.4.2 with
Sun Microsystems Inc. Java HotSpot™ Client VM.
4.1. Number of neural network layers
A two-hidden-layer sigmoid/linear network can represent any functional relationship between inputs and
outputs if the sigmoid layer has enough neurons [60]. A
two-hidden-layer network, with a tan-sigmoid transfer
function in the first hidden layer and a linear transfer function in the second hidden layer, is thus adopted. The number of neurons in the second hidden layer is constrained by
the number of outputs required by the problem. The output in this work is solubility, so there is one neuron in
the second hidden layer.
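As an illustration of the architecture just described, the following is a minimal NumPy sketch of a forward pass through a tan-sigmoid layer followed by a single linear neuron. It is not the authors' MATLAB code: the function names, the weight initialization and the example input are assumptions made only for illustration.

```python
import numpy as np

def init_network(n_inputs, n_hidden, seed=0):
    """Create random weights for a tanh layer followed by one linear neuron.

    The initialization scheme is an assumption; the paper does not state one.
    """
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.normal(scale=0.5, size=(n_hidden, n_inputs)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(scale=0.5, size=(1, n_hidden)),
        "b2": np.zeros(1),
    }

def forward(params, x):
    """Return the network output for each row of x, shape (n_samples, 1)."""
    x = np.atleast_2d(np.asarray(x, dtype=float))
    hidden = np.tanh(x @ params["W1"].T + params["b1"])  # tan-sigmoid layer
    return hidden @ params["W2"].T + params["b2"]        # single linear neuron

# Three inputs (e.g. size, valency, electronegativity), five tanh neurons.
params = init_network(n_inputs=3, n_hidden=5)
prediction = forward(params, [[1.0, 0.0, 1.0]])
```

The single row of `W2` reflects the one output, solubility; the width of the tanh layer is the free design choice discussed in the next subsection.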
4.2. Number of neurons in the first hidden layer
The choice of number of neurons in the first hidden
layer is up to the designer. The optimum number of neurons in the first hidden layer may be a function of (1)
input/output vector size, (2) size of training and testing
sub-sets and, more importantly, (3) the problem of non-linearity [2]. The optimum number is found by trial and error
by placing a different number of neurons in the first hidden
layer for the same data set.
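The trial-and-error search can be sketched as a loop over candidate layer sizes that keeps the size with the lowest testing error. To stay short and deterministic, the sketch below does not use backpropagation: it fixes random tanh weights and solves the linear output neuron by least squares, which is a swapped-in simplification rather than the training used in the paper. The data are synthetic and all names are hypothetical.

```python
import numpy as np

def fit_and_score(x_train, y_train, x_test, y_test, n_hidden, seed=0):
    """Fit a tanh/linear network with n_hidden neurons; return mean |test error|.

    Simplification (assumption): hidden weights are random and fixed, and only
    the linear output neuron is fitted, by least squares.
    """
    rng = np.random.default_rng(seed)
    w = rng.normal(size=(x_train.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)

    def design(x):
        # Hidden activations plus a column of ones for the output bias.
        return np.column_stack([np.tanh(x @ w + b), np.ones(len(x))])

    coef, *_ = np.linalg.lstsq(design(x_train), y_train, rcond=None)
    return float(np.mean(np.abs(design(x_test) @ coef - y_test)))

def choose_hidden_size(x_train, y_train, x_test, y_test, candidates):
    """Try each candidate size on the same data and keep the best one."""
    errors = {n: fit_and_score(x_train, y_train, x_test, y_test, n)
              for n in candidates}
    best = min(errors, key=errors.get)
    return best, errors

# Synthetic stand-in data: 60 samples, 3 inputs (not the alloy data set).
rng = np.random.default_rng(1)
x = rng.uniform(-1.0, 1.0, size=(60, 3))
y = np.sin(2.0 * x[:, 0]) + x[:, 1] * x[:, 2]
best_n, test_errors = choose_hidden_size(
    x[:40], y[:40], x[40:], y[40:], candidates=[1, 2, 4, 8, 16])
```

Selecting on the testing error, rather than the training error, is what guards against simply choosing the largest network.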
4.3. Improving generalization
Overfitting occurs during network training if the error
associated with the training set becomes low, but the error
α = β = γ ≠ 90°; α = γ = 90° ≠ β; α ≠ β ≠ γ ≠ 90°)
and (3) crystal system (simple; base-centred; face-centred; body-centred).
6. Determination of the output parameters
In Hume-Rothery's Rules, a soluble/insoluble criterion
is described. However, it would be more advantageous to
attempt to predict the original value of the solubility. The
output parameters, which are the solubility limits of each
alloy system, are therefore expressed in two ways:
(1) Follow the specialized criterion: if the solubility of
solute metal in solvent metal exceeds 5 at.% [38,62],
it is said that this solute metal is soluble in the solvent
metal.
(2) Use the original maximum solubility limits of each
alloy system.
7. Results
The data are used in two ways: (1) the first 60-alloy systems mentioned by Hume-Rothery in 1934 are used as a
start; (2) the whole 408-alloy systems are then used to test
whether Hume-Rothery's Rules work for copper and silver
alloy systems in general.
7.1. Testing Hume-Rothery's rules within 60-alloy systems
Of the four parameters, the size factor, the electrochemical factor and the relative valency factor were those used
by Hume-Rothery in 1934, so these are used in initial tests
for predicting solubility using the following criteria:
1. If the atomic diameters differ by more than 14%, the
size factor is unfavourable and the
input number for this parameter is zero; otherwise it is one.
2. If the valency of the solvent atom is lower than that of the solute atom, the input number for this parameter is one;
otherwise it is zero.
3. If the electronegativities of the solvent and
solute atoms differ by more than 0.4, the limit mentioned by Darken
and Gurry [34] and Zhang and Liao [38], the input number for this parameter is zero; otherwise it is one.
4. If the solubility of the solute metal in the solvent metal exceeds
5 at.%, the output number is one; otherwise it is zero.
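The four criteria above amount to a small encoding function, sketched here in Python. The thresholds (14%, 0.4 and 5 at.%) come from the text; taking the diameter difference relative to the solvent diameter is an assumption, as are the function and argument names, and the numbers in the usage example are hypothetical rather than handbook values.

```python
def encode_system(d_solvent, d_solute, v_solvent, v_solute,
                  en_solvent, en_solute, solubility_at_pct,
                  size_limit_pct=14.0, en_limit=0.4, soluble_limit_at_pct=5.0):
    """Return ([size, valency, electronegativity] inputs, soluble output)."""
    # Criterion 1: size factor (difference taken relative to the solvent
    # diameter, which is an assumption about how the percentage is defined).
    size_diff_pct = abs(d_solute - d_solvent) / d_solvent * 100.0
    size_input = 0 if size_diff_pct > size_limit_pct else 1
    # Criterion 2: relative valency factor.
    valency_input = 1 if v_solvent < v_solute else 0
    # Criterion 3: electrochemical factor.
    en_input = 0 if abs(en_solvent - en_solute) > en_limit else 1
    # Criterion 4: soluble/insoluble output.
    soluble = 1 if solubility_at_pct > soluble_limit_at_pct else 0
    return [size_input, valency_input, en_input], soluble

# Hypothetical system: 20% size difference, solute of higher valency,
# electronegativity difference of 0.7, solubility of 2 at.%.
inputs, output = encode_system(2.5, 3.0, 1, 2, 1.9, 1.2, 2.0)
```

Passing `size_limit_pct=15.0` gives the modified Darken and Gurry criterion without changing the rest of the encoding.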
In this case, the problem to be solved is a classification
problem, so a probabilistic neural network [54,55] is
designed for use. This is a radial basis network suitable
for classification problems. The modified criterion of 15%
for the diameter difference mentioned by Darken and
Gurry [34] is used in the next trial, also with soluble/insoluble as output, and both results are listed in Table 1 in
terms of the percentage of all the 60 predictions that are
wrong. Comparing these results, a slight difference is
found: the percentage error of the predicted results based
Table 1
Testing Hume-Rothery's Rules with 60-alloy systems using his criterion
(14% variation), the later suggestion of 15% and the 15% criterion with
structural identity (same or not)
Choice (%)             Testing    Whole data
14                     8.3        15
15                     8.3        13
Structure parameter    8.3        13
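For readers unfamiliar with probabilistic neural networks, the soluble/insoluble classification can be sketched in its Parzen-window form: each training pattern contributes a Gaussian kernel, the activations are averaged per class, and the class with the larger score wins. This is a generic sketch, not the MATLAB network used by the authors; the `spread` value, the function name and the toy training patterns are assumptions.

```python
import numpy as np

def pnn_predict(x_train, y_train, x_new, spread=0.5):
    """Classify each row of x_new with a Gaussian-kernel probabilistic network."""
    x_train = np.asarray(x_train, dtype=float)
    y_train = np.asarray(y_train)
    x_new = np.atleast_2d(np.asarray(x_new, dtype=float))
    classes = np.unique(y_train)
    # Pattern layer: squared distance from each new point to each training
    # pattern, passed through a Gaussian radial basis function.
    d2 = ((x_new[:, None, :] - x_train[None, :, :]) ** 2).sum(axis=2)
    kernel = np.exp(-d2 / (2.0 * spread ** 2))
    # Summation layer: average activation over the patterns of each class.
    scores = np.column_stack(
        [kernel[:, y_train == c].mean(axis=1) for c in classes])
    # Decision layer: pick the class with the highest score.
    return classes[np.argmax(scores, axis=1)]

# Toy patterns (hypothetical): binary [size, valency, electronegativity]
# inputs with a soluble (1) or insoluble (0) label.
x_train = [[1, 1, 1], [1, 0, 1], [0, 0, 0], [0, 1, 0]]
y_train = [1, 1, 0, 0]
labels = pnn_predict(x_train, y_train, [[1, 1, 1], [0, 0, 0]])
```

The percentage of wrong predictions reported in Table 1 would then be the share of systems for which the predicted label differs from the recorded one.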
Table 2
Testing Hume-Rothery's Rules using original parameter values
M
Training
Testing
Training
Testing
Training
Testing
0.984
0.193
0.158
2.75
0.996
0.383
Fig. 2. Prediction of solubility using three functionalized parameters for the 60-alloy system data set: atomic size, valency and electronegativity. (a)
Training set (R = 0.993); (b) testing set (R = 0.992); (c) whole set (R = 0.992). Each panel plots the predicted value (A) against the experimental value (T), showing the data points, the best linear fit and the line A = T.
Fig. 3. Prediction of solubility using four functionalized parameters for the 60-alloy systems: atomic size, valence, electronegativity and structure. (a)
Training set (R = 0.985); (b) testing set (R = 0.976); (c) whole set (R = 0.975). Each panel plots the predicted value (A) against the experimental value (T).
to mean error is, in all but one case, greater than unity.
These trends in the assessment criteria are consistent for
both the testing set and the whole set.
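The assessment criteria referred to here (regression coefficient, mean modulus of error and standard deviation of the modulus of error) can be computed along the following lines. Two choices in this sketch are assumptions: R is taken as the Pearson correlation between predicted and experimental values, and the standard deviation is the sample form (`ddof=1`). The example numbers are hypothetical.

```python
import numpy as np

def assess(predicted, experimental):
    """Return R, mean |error| and SD of |error| (errors in at.%)."""
    predicted = np.asarray(predicted, dtype=float)
    experimental = np.asarray(experimental, dtype=float)
    # Regression coefficient, assumed to be the Pearson correlation.
    r = float(np.corrcoef(experimental, predicted)[0, 1])
    # Modulus (absolute value) of the prediction error for each system.
    abs_err = np.abs(predicted - experimental)
    return {
        "R": r,
        "mean_abs_error": float(abs_err.mean()),
        "sd_abs_error": float(abs_err.std(ddof=1)),  # sample SD (assumption)
    }

# Hypothetical solubilities in at.%.
criteria = assess(predicted=[1.0, 9.0, 22.0, 30.0],
                  experimental=[0.0, 10.0, 20.0, 30.0])
ratio_sd_to_mean = criteria["sd_abs_error"] / criteria["mean_abs_error"]
```

The last line gives the ratio of standard deviation to mean error, the quantity compared with unity in the text.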
The best predictive results for the network are obtained
using the functionalized values of atomic size, valence and
electronegativity to predict the original values of solubility
for the 60-alloy data set used by Hume-Rothery himself,
and the data are plotted in Fig. 2. Inclusion of the structural factor using the parameter described above weakens
the predictive power of the network (Fig. 3). The reason
for this slightly counter-intuitive finding is that crystallographic compatibility is likely to become more important
at higher solubility levels, being essential for continuous
solubility. However, the majority of data are at the low solubility end, where substitutional atoms are at a low coordination number. Another reason is that the number used to
represent structure actually conceals crystallographic similarities, as discussed in more detail below, and there is not
enough training data for the network to establish these similarities by itself. The structure parameter is used to assess
the criterion for solubility that the same crystal structure
for the two elements favours a wide solubility range [64].
This makes it a type of classification problem, not completely the same as a mapping problem, and it could be
argued that including it in this type of network is
inappropriate.
Table 3
Comparison of criteria for predicting solubility using different combinations of parameter groups
Conditions^a
Test set
Whole set
Mean modulus of
error (at.%)
0.992
0.0579
2.46
0.976
0.168
6.98
0.695
0.662
7.01
SD of modulus of
error (at.%)
Mean modulus of
error (at.%)
3.21
0.992
0.0422
1.65
1.94
4.58
0.975
0.0851
3.21
3.21
0.768
0.631
6.30
14.1
SD of modulus of
error (at.%)
12.7
Fig. 5. Prediction of solubility using three functionalized parameters for the 408-alloy systems: atomic size, valency and electronegativity: (a) Training set (R = 0.79),
(b) testing set (R = 0.695), (c) whole set (R = 0.768). Each panel plots the predicted value (A) against the experimental value (T), showing the data points, the best linear fit and the line A = T.
independent of each other. The importance of the structural parameter has been tested and found not to play a
very important overall role, although of course it does
influence the possibility of continuous solubility.
The relative importance of size factor, valence and electronegativity is compared in Table 4. Using the same procedure (functionalized parameters including structure), the
network is run with one parameter omitted at a time on the
set of 60 systems.
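The omission procedure can be sketched as an enumeration of input-column subsets, each of which would then be passed to the network for training and testing. The helper below only builds the subsets; the names, the column order and the demonstration array are assumptions, and no training is performed.

```python
from itertools import combinations

import numpy as np

# Assumed column order of the input matrix.
PARAMETERS = ("atomic size", "valence", "electronegativity", "structure")

def omission_trials(inputs, names, n_omitted):
    """Return (kept_names, reduced_inputs) for every way of omitting
    n_omitted parameters from the input matrix."""
    inputs = np.asarray(inputs, dtype=float)
    n_keep = len(names) - n_omitted
    trials = []
    for kept in combinations(range(len(names)), n_keep):
        kept_names = tuple(names[i] for i in kept)
        trials.append((kept_names, inputs[:, list(kept)]))
    return trials

# Placeholder input matrix: 3 systems x 4 parameters (not real data).
demo_inputs = np.arange(12.0).reshape(3, 4)
three_parameter_trials = omission_trials(demo_inputs, PARAMETERS, n_omitted=1)
two_parameter_trials = omission_trials(demo_inputs, PARAMETERS, n_omitted=2)
```

Omitting one of four parameters gives four three-parameter combinations, and omitting two gives six two-parameter combinations.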
In general, mean error (data columns 3 and 7) varies
inversely with regression coefficient (data columns 1 and
5), and the standard deviation of error is between 1.1 and
1.8 times the mean error. Using the mean error
of the testing set as the main criterion for accuracy of prediction, the parameters atomic size, valence and electronegativity provide the strongest prediction of solubility and, of
these, atomic size has the strongest effect because, when it is
omitted, the error is highest (data row 2). Electronegativity
appears to have a stronger influence than valence (data rows
3 and 4). In fact, these parameters are not wholly independent of each other. As mentioned by Hume-Rothery, they
are related, and their interplay makes the determination
of solubility very difficult [22]. As a result, determining the
relative importance of each parameter is not easy; it can
only be said descriptively that the atomic size and electro-
Table 4
Comparison of criteria for predicting solubility using different combinations of three parameters
Conditions^a
Test set
Whole set
Mean modulus of
error (at.%)
0.992
0.0579
2.46
0.867
0.308
8.19
0.93
0.365
3.17
0.569
0.477
7.73
SD of modulus of
error (at.%)
3.21
11.7
4.19
13.8
Mean modulus of
error (at.%)
SD of modulus of
error (at.%)
0.992
0.0422
1.65
1.94
0.924
0.197
4.20
6.34
0.968
0.142
3.46
3.66
0.761
0.613
7.07
11.0
Table 5
Comparison of criteria for predicting solubility using different combinations of two parameters
Conditions^a
Test set
Whole set
0.852
0.679
0.470
0.860
4.47
6.99
0.675
0.91
0.889
0.153
0.459
0.607
Mean modulus of
error (at.%)
SD of modulus of
error (at.%)
Mean modulus of
error (at.%)
SD of modulus of
error (at.%)
2.50
4.83
0.496
0.495
1.36
1.36
9.52
10.3
14.3
13.8
10.2
7.31
6.42
10.1
0.441
0.925
1.50
0.184
10.7
4.54
14.4
6.06
1.02
12.4
11.9
0.662
0.886
9.82
11.3
1.30
11.3
21.1
0.524
1.35
9.00
14.5