

Models of Data and Inverse Methods

Paul Humphreys

Fifty years after its publication, Patrick Suppes' paper `Models of Data' (1962) stands as a remarkable achievement in the philosophy of science. I shall briefly
lay out some of the central features of his paper and then use the basic idea of a
hierarchy of models to explore the relation between data and inference in a very
different setting from the one that Suppes used. Let me emphasize at the outset that
I am not attempting an exegesis of Suppes' paper and that I am using many of the
ideas outside their original domain of application. The theoretical contexts to which
Suppes applied his hierarchy of models differ in important ways from the contexts
that I shall discuss, the most important of these being that Suppes was interested
in situations where there exists a well established parametric theory of the top level
domain and statistical estimation of the parameters from experimental data is the
goal. The present project is to see how inferences from data are constrained when
the quantity to be estimated is not part of an explicit theory.
Here are some of the main points of Suppes' paper.

A. A hierarchy of models exists between theory and data when the latter are connected with the former. This account of how empirical content is injected into
formal theories is significantly more sophisticated and detailed than was the
earlier logical empiricists' use of coordinating definitions. One virtue, consistent
with the semantic account's de-emphasis on particular syntactic formulations of
theories, is that Suppes' approach is largely unconcerned with linguistic meaning
and focuses instead on how quantitative estimates are provided for parameters
that occur in abstractly formulated theories.
B. The models in the hierarchy are of different logical type. There is a variety
of reasons for this. One is that there are concepts in the theory that have no
observable analogues in the experimental data, a point that anticipates in certain respects the well-known distinction between phenomena and data made in
Bogen and Woodward (1988). A second reason is that some levels in the hierarchy will contain models with continuous variables or infinite data sequences while
others will have models using discrete variables and finite data sets only. This
distinction between continuous and discrete versions of models was later critical
to understanding how the models that drive computer simulations differ from
the continuous mathematical models behind the simulations. These moves are
mathematically non-trivial and often require additional correction techniques to
avoid errors and artifacts arising from the discrete approximations.
C. In order for there to be models of the data there has to be a theory of the generating conditions for the data. Put another way, the models of data with which
Suppes is concerned are perhaps better called `models of data from correctly designed experiments'. For example, within the learning model that Suppes uses
as a running example, there are possible realizations of the data that fail to
satisfy the stationarity condition that applies to the reinforcement schedule in a
model of the experiment. Although such realizations count as possible data, they
are inappropriate for estimating the parameter θ of the linear learning model.
The need for a theory of the experiment makes the position radically different
from most empiricist accounts of data in which minimal reference to theories is
desirable and the generating conditions of the data are frequently not a part of
the empiricist analysis.
D. One of the themes of Suppes' article is that many details of an experimental
arrangement cannot be included in the hierarchy of models either because they
cannot be couched in terms of the language of the theory to be tested or because
they involve heuristics that cannot easily be formalized. This orientation is part
of a more general methodological claim that `...the only systematic results possible in the theory of scientific methodology are purely formal...' (p. 261). This
position is appealing, at least in the sense of seeing how far a purely formal
analysis of data processing can be taken.

1.1 Computerized Tomography and Inverse Methods


My running example will be data generated by imaging devices that use computed
tomography (CT). These instruments take data that originate with physical sources,
such as X-rays, and use computationally intensive processing of that data to produce the final image. Because of space constraints, I shall provide only the basic
features of these instruments; further details can be found in Humphreys (2013b, Forthcoming).
One of the core features of CT instruments is that they construct solutions to
inverse problems, a kind of problem that occurs in many other contexts such as
geophysics and radioastronomy. Although an inverse problem is sometimes taken
to be one that involves an inference from data to model parameters, and hence is
directly related to Suppes' framework, there is not a uniform use of the term and
generically an indirect problem involves an inference from the data to the generating
conditions of that data. One of the morals of this paper is that philosophers can
benefit from methods that have been developed to solve inverse problems since they
are directly relevant to a number of topics in the philosophy of science, including
scientific realism and the application of mathematics. More on that later.
In two-dimensional computerized tomography instruments, M parallel X-ray
beams, collimated to lie in a plane, traverse the object to be imaged and impinge
on M detectors on the far side of the object. The energy of the X-rays is attenuated
by traveling through the object and the degree of attenuation depends upon the
varying densities of the materials through which the X-rays are traveling. From the
detector measurements and the initial intensities, the total attenuation along each
ray is easily calculated. The sources and detectors are then rotated by an angle π/N
and the process is iterated. (Here, the value of N determines at how many points
around the half-circle the X-ray beams are triggered.) Possible realizations of the
data thus consist of an M × N matrix of rational numbers.

$$\begin{pmatrix} r_{11} & \cdots & r_{1N} \\ \vdots & \ddots & \vdots \\ r_{M1} & \cdots & r_{MN} \end{pmatrix}$$

where r_ij ∈ [0, I_max]. I note that from the perspective of the algorithms involved, it
is of no importance whether such possible realizations come from actual measurements, from simulations, or are simply numerical arrays.
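As a minimal illustration of this point, such a possible realization can be built as a bare numerical array; the dimensions and intensity bound below are assumptions chosen purely for illustration. Any downstream reconstruction routine sees only the array, however its entries were produced.

```python
import numpy as np

# Assumed, purely illustrative dimensions: M detector readings per view,
# N angular positions spaced by pi/N around the half-circle.
M, N = 256, 180
I_max = 1.0

# A possible realization of the data: an M x N matrix with entries in [0, I_max].
# Nothing in the array itself records whether it was measured or simulated.
rng = np.random.default_rng(0)
data_matrix = rng.uniform(0.0, I_max, size=(M, N))

assert data_matrix.shape == (M, N)
assert np.all((data_matrix >= 0.0) & (data_matrix <= I_max))
```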
The inverse problem is then to construct a two-dimensional image of the target
in the plane of the beams from that data about total attenuation. The image can
be represented by a function µ on the Cartesian plane, where µ(x, y) is the value
of the X-ray attenuation coefficient at the point (x, y). The attenuation coefficient
values in a given region are strongly associated with the density of the material in
that region and the former can therefore be taken to represent the latter. The most
common construction method is filtered backprojection.
In broad outline, filtered backprojection contains these steps:

Step 1 Calculate the total attenuation along a given ray between a source and its
detector by integrating the values of µ along that ray. This gives a representation of the data values in terms of µ.
Step 2 Convolve these spatial projections with a filter. The filter compensates for
distortions introduced into the representations by coordinate transformations
in discrete models.
Step 3 Fourier transform these convolutions into the frequency domain.
Step 4 Compute the convolutions as products in the frequency domain.
Step 5 Inverse Fourier transform the results back to the spatial domain. Steps
3 through 5 are primarily to accommodate computational load constraints,
something that does not appear in traditional analyses of models but that is
of central concern in the real-life application of these methods.
Step 6 Compute the inverse Radon transforms in the spatial domain to arrive at
values of µ(x, y) for the desired points (x, y) within the target frame.¹ I note
here that these inverse transforms introduce a severely non-local aspect to
the relation between data and image, in that a given image pixel is reconstructed from multiple backprojections taken at different values of θ_i and a
given datum contributes to the reconstruction along the whole ray associated
with that datum. This requires a very different attitude towards data correction than does the usual compositional approach to which philosophical
discussions are largely directed.
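The following sketch indicates how steps 2 through 6 might be implemented, assuming the projection data are already given as an M × N array; the simple ramp filter and nearest-neighbour backprojection used here are simplifications adopted for brevity, not the algorithms of any particular scanner.

```python
import numpy as np

def filtered_backprojection(sinogram, image_size):
    """Reconstruct a 2-D image from parallel-beam projection data.

    sinogram   : (M, N) array; column j holds the M radial samples taken
                 at angle theta_j = j * pi / N.
    image_size : side length (in pixels) of the square reconstruction grid.
    """
    M, N = sinogram.shape
    thetas = np.arange(N) * np.pi / N

    # Steps 2-5: convolve each projection with a ramp filter, computed as a
    # product in the frequency domain to reduce computational load.
    freqs = np.fft.fftfreq(M)
    ramp = np.abs(freqs)
    filtered = np.real(np.fft.ifft(np.fft.fft(sinogram, axis=0) * ramp[:, None], axis=0))

    # Step 6: backproject each filtered projection along its rays.
    half = image_size / 2.0
    ys, xs = np.mgrid[0:image_size, 0:image_size]
    xs = xs - half
    ys = ys - half
    image = np.zeros((image_size, image_size))
    for j, theta in enumerate(thetas):
        # Radial coordinate of each pixel in the detector frame at angle theta.
        r = xs * np.cos(theta) + ys * np.sin(theta)
        # Map r to the nearest detector bin (bins assumed centred on the axis).
        bins = np.clip(np.round(r + M / 2.0).astype(int), 0, M - 1)
        image += filtered[bins, j]
    return image * np.pi / N
```

Applied to a sinogram such as the array sketched above, the function returns a square array of estimated attenuation values µ(x, y).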

Now, instead of the target frame, consider a detector frame which is oriented
at an angle θ to the target frame. Each ray can be represented mathematically
by the line parameterized by r and θ: L_θ(r) = {(x, y) : r = x cos(θ) + y sin(θ)},
where r is the radial coordinate. The total attenuation along the line L is given by
∫_L µ(x, y) dL. This represents projected values of µ(x, y) along the ray orthogonal to
the r axis of the detector frame when it is oriented at angle θ to the target frame.
R∞ R∞
(1) Šθ (r) =
y=−∞ x=−∞
µ(x, y)δ(xcos(θ) + ysin(θ) − r)dxdy
which is the Radon transform of µ over L. It is here that we have the first connection
between the data values and a formal representation of them. The Radon transform
is continuous, as is the theoretical Fourier transform of steps 3 and 5, but the
computational implementations of these are inevitably discrete and this move shows
that we must step down to a logically different type of model even in the absence of
an explicit theory of the phenomena.
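To make that step down concrete, the sketch below approximates the double integral in (1) by a finite sum over a pixel grid, with the delta function replaced by assignment to a narrow radial bin around the line r = x cos(θ) + y sin(θ); the grid size and bin width are assumptions made for illustration.

```python
import numpy as np

def discrete_radon(mu, thetas, n_r, bin_width=1.0):
    """Discrete approximation of the Radon transform in equation (1).

    mu     : (K, K) array of attenuation coefficients mu(x, y) on a pixel grid.
    thetas : angles at which projections are taken.
    n_r    : number of radial bins per projection.
    """
    K = mu.shape[0]
    half = K / 2.0
    ys, xs = np.mgrid[0:K, 0:K]
    xs = xs - half
    ys = ys - half
    sinogram = np.zeros((n_r, len(thetas)))
    for j, theta in enumerate(thetas):
        r = xs * np.cos(theta) + ys * np.sin(theta)
        bins = np.clip(np.round(r / bin_width + n_r / 2.0).astype(int), 0, n_r - 1)
        # Sum mu over all pixels falling into each radial bin: a finite stand-in
        # for the continuous line integral with the delta-function constraint.
        np.add.at(sinogram[:, j], bins.ravel(), mu.ravel())
    return sinogram
```

A sinogram produced this way can then be fed to the filtered backprojection sketch above, closing the loop from a simulated µ to its reconstructed image.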
Perhaps the most important difference between the application discussed here
and Suppes' original account is that the values of µ occur as stand-alone values
rather than as parameters of a broader theory. Data analysis in the absence of theory
or hypothesis testing has become increasingly important in recent years because of
the enormous increase in data that is available in high energy physics, astrophysics,
climate modeling, financial markets, and other areas, and there have been interesting
suggestions that non-theoretical approaches to data may be the most appropriate
methods in certain areas. (See e.g. Napoletani et al. (2011), Humphreys (2013a).)
Rather than theories and their associated models, our focus is thus on methods that
operate on data. Despite the absence of a parametric top level theory, the goal of
accurately estimating values of µ(x, y) from the data fits the general motivation
behind Suppes' restriction that `The central idea...is to restrict models of the data
to those aspects of the experiment which have a parametric analogue in the theory.'
(p. 258). In the present case, models do not act as possible realizations of a theory but
provide constraint conditions on the methods used to transform data.

¹ The inverse Radon transform f(x, y) can be represented by $\frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{d}{dy}\,H[R(r,\,y-rx)]\,dr$, where
H and R are Hilbert and Radon transforms, respectively. Thanks to Tom Ryckman for pointing
this out.

1.2 Models of the Instrument


Next are models of the instrument. Suppes' focus was data from laboratory experiments that are used to estimate a parameter included in a linear learning theory.
In our case we must have a theory of the instrument rather than a theory of the
experiment, and models of the data will then be possible realizations of the data that
satisfy the theory of the instrument. Suppes is right to emphasize that in his example the learning theory itself is assumed to be correct and is not being tested. With
imaging devices of the kind considered here, we are also not testing the theories
that lie behind the design of the instrument and the interpretation of the inverse
inferences. They are sufficiently well established that they are taken to be true;
nobody seriously questions the existence of X-rays and their experimentally established properties. We do have a theory of the instrument that determines in generic
terms what kind of data should be produced. Here the physical theory behind the
instrument design is, in its basic form, extremely simple: it simply asserts that
X-rays are transmitted from a discrete array of sources, are transmitted through
the target object with varying degrees of absorption depending upon the material
composition of the target, and the intensity of the arriving X-rays is recorded by a
discrete array of detectors along a line parallel to the set of sources. This apparent
simplicity is misleading, for complications such as Compton scattering, photoelectric absorption, detector noise, partial volume artifacts, and many others have to
be taken into consideration. In addition, the beams must be properly collimated,
the intensity of a given source must be constant across all indices i and j, corrections must be made for detector inefficiencies, and so on. I list these not simply to
note that the path from data to image is thoroughly infused with models, but also
to note that many, if not most, of these factors not only can be formally represented but
must be, in order for the image construction algorithms to make appropriate corrections.
The theory of the instrument is then captured by a complex and interlocking set
of models that represent both physical and computational processes. Just as in the
case of experiments, we have to provide these models of the instrument in order
to ensure that a possible realization of the data counts as a model of the data, but
there is one aspect that requires specific discussion and which raises an important
issue connected with Suppes' point C above.
Whether it is desirable or even acceptable to alter and adjust data from instruments and experiments is a delicate methodological issue. Although adjusting a
data set so that it is a model of the data will usually leave the data matrix within
the class of possible realizations of the data, both the process and the motivation in
the case of data correction differ from situations in which the model of the data was
arrived at as raw data from an experiment. In the case of an experiment, because
a model of the data is a sequence of possible data points from a well conducted
experiment, we are entitled to exclude as a model of the data a sequence of data
points that violates what we know about the generating conditions that constitute
the experiment. Analogously, in the case of instruments we are entitled to adjust
data that are generated by the instruments in the light of theories and models that
apply to those generating conditions. One of the things that is of particular interest
in the CT instruments is to avoid artifacts of the instrument, and this is possible only
by adjusting data sets using the networks of models of the instrument mentioned
above. These artifacts can arise either from physical processes in the instrument or
from computational processes in the image production. A simple example of this
involves beam hardening, which results from the complete absorption by the target
of all X-rays that fall below a threshold energy level, resulting in errors in the image
construction. Corrections for beam hardening can be made either by using physical
filters or by software correction algorithms, both of which can be formally modeled.
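As a hedged illustration of the software route, such a correction is often modeled as a low-order polynomial linearization of the measured attenuation values; the coefficients below are placeholders standing in for values that would come from calibration measurements, not parameters of any actual instrument.

```python
import numpy as np

def correct_beam_hardening(sinogram, coeffs=(0.0, 1.0, 0.08)):
    """Apply a polynomial linearization to measured attenuation values.

    Beam hardening makes measured attenuation grow sub-linearly with material
    thickness; a low-order polynomial fitted to calibration data (the `coeffs`
    here are illustrative placeholders) maps measured values back toward the
    linear behaviour assumed by the reconstruction algorithm.
    """
    p = np.asarray(sinogram, dtype=float)
    c0, c1, c2 = coeffs
    return c0 + c1 * p + c2 * p**2
```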
Such models do not fit into a neat hierarchy. The order in which they are deployed
can vary and their use may require iterated cycles of application. That said, the
use of models of the instrument stands in sharp contrast to an empiricist tradition
that views theoretical transformations of, and changes to, so-called `raw data' as
epistemologically counterproductive. Yet for CT instruments and elsewhere, such
data manipulation is not merely desirable but necessary in many cases to arrive at
accurate outputs. This of course has its dangers and an explicit acknowledgement
of such manipulations is required. This point must be distinguished from a better-known issue that I shall now discuss.
One of the challenges of dealing with the kind of data that comes from instruments is to relate the languages of all the different theories that come into play in
the operation of the instrument and the generation of the data. Because the entire
inference and reconstruction process from data to image is automated, Suppes' goal,
stated in point D, of restricting the analysis of models to formal methods can be
satisfied in a completely general way. Although it is undeniable that some aspects of
experimental procedure and of instrument use require heuristic tricks of the trade
to successfully produce reliable data, one should not underestimate the extent to
which many such adjustments can be given formal representations.
Let me relate this to point A above. The view that empirical content diffuses
through the entire theoretical apparatus of science, affecting even the most mathematical parts, has long dominated the philosophy of science and this broadly
Quinean view has itself become something of a dogma. In a previous publication
(Humphreys (2008, Section 4)) I argued that in the case of probability theory one
can preserve the purely formal character of the measure-theoretic formulation of
probability theory and the statistical models that are used to apply it, by restricting the empirical input to a mapping between the last statistical model and the
data. Suppes' hierarchy of formal models gives support to this position in the case
where a general background theory exists, especially since he notes (p. 252) that for
present purposes, it is unimportant whether the formalization takes place within
the semantic or the syntactic account.

1.3 Inverse Problems


In a more general setting, moves from observations to unobservable entities can be
considered as inverse inferences from the data to their source. For philosophers, the
problem of scientific realism can then be seen to consist in providing a solution to a
particular type of inverse problem. Once we view matters in this way many familiar
philosophical issues take on a different cast. An inverse inference problem is said
to be ill-posed just in case there is either not a unique solution to the problem or
the solution does not depend continuously on the data. Thus, if we can show that
the representation of an inverse problem is well-posed, this gives us a response to
certain types of anti-realism, albeit at the price of shifting the inductive problem to
justifying the use of that representation for the case at hand. It is regularly said, and
correctly, that such inferences from the observable to the unobservable are always
under-determined. Yet there are well developed techniques within the area of inverse
methods to deal with such under-determination. For example, given a finite data
set, the inverse Radon transform based on those data is not unique. Yet by using
results such as the Nyquist sampling theorem, which determines how often a given
band-limited continuous function must be sampled for the sampled signal to contain the same
amount of information as the continuous signal, errors in the reconstructed image
can be drastically reduced. (See Buzug (2008, pp. 135 ff.))
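The sampling-rate point can be illustrated with a band-limited test signal rather than projection data: sampled at or above the Nyquist rate the dominant frequency is recovered, while undersampling aliases it onto a lower frequency. The frequencies and rates below are assumptions chosen purely for illustration.

```python
import numpy as np

F_SIGNAL = 30.0  # Hz; a band-limited "signal" whose Nyquist rate is 60 Hz.

def signal(t):
    return np.sin(2 * np.pi * F_SIGNAL * t)

def dominant_frequency(sample_rate, duration=1.0):
    """Sample the signal and return the dominant frequency in the samples."""
    n = int(sample_rate * duration)
    t = np.arange(n) / sample_rate
    spectrum = np.abs(np.fft.rfft(signal(t)))
    freqs = np.fft.rfftfreq(n, d=1.0 / sample_rate)
    return freqs[np.argmax(spectrum[1:]) + 1]  # ignore the DC component

print(dominant_frequency(200.0))  # sampled well above the Nyquist rate: ~30 Hz
print(dominant_frequency(45.0))   # undersampled: the 30 Hz signal aliases to ~15 Hz
```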
Finally, there is one point made by Suppes that I believe needs interpretation.
He says that `From a conceptual standpoint the distinction between pure and applied mathematics is spurious - both deal with set-theoretic entities and the same
is true of theory and experiment.' (p. 260) Interestingly, this view is maintained
in his comprehensive treatise (Suppes (2002, p. 33)) while explicitly addressing the
finitistic character of much applied mathematics (ibid. pp. 303–311). Viewed from
an abstract perspective, this is correct but that abstraction disguises some philosophically important differences. Put in the form of an aphorism, we can say that
applied mathematics is not always the application of pure mathematics. The reason is this: in some areas of applied mathematics, and optimization methods are a
good example, theorems do not exist that guarantee the success of a method when
applied to particular situations. Rather than the deductive procedures that are
usually discussed in the philosophy of mathematics literature, heuristically justified
trial and error procedures are often used that lend an inductive aspect to applied
mathematics. Nonconstructive proofs have been much discussed in the philosophy
of mathematics, but pure existence proofs in applied mathematics raise issues that
are important to the philosophy of science, not the least because existence results
for optimization procedures do not always provide an effective method for finding
the optimum. Take one example of a standard numerical methods procedure, attempting to find the global minimum of a function. Given an objective function f,
an optimization method will arrive at either a local or a global minimum, but with
many complicated functions there will be no proof that the minimum reached is
global rather than local. Consider this example:
A set S ⊆ Rⁿ is convex if it contains the line segment between any two of
its points, i.e., {αx + (1 − α)y : 0 ≤ α ≤ 1} ⊆ S for all x, y ∈ S. A function
f : S ⊆ Rⁿ → R is strictly convex on a convex set S if its graph along any
line segment in S lies strictly below the chord connecting the function values at the
endpoints of the segment, except at the endpoints themselves, i.e., if f(αx + (1 − α)y) < αf(x) + (1 − α)f(y) for all
α ∈ (0, 1) and all x ≠ y ∈ S. Then any local minimum of a strictly convex function
f on a convex set S ⊆ Rⁿ is the unique global minimum of f on S. But it is often
impossible to determine for a given objective function f whether it satisfies the
convexity conditions needed for the theorem. Methods such as steepest descent and
conjugate gradient can be used to converge on a solution if it exists but cannot be
guaranteed to succeed.
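A small sketch makes the contrast concrete: the same fixed-step steepest descent routine is run on a strictly convex objective, where the theorem guarantees that the minimum reached is the global one, and on a non-convex objective, where the point reached depends on the starting guess and nothing in the run certifies that it is global. Both objective functions are assumptions chosen purely for illustration.

```python
import numpy as np

def steepest_descent(grad, x0, step=0.05, tol=1e-8, max_iter=20000):
    """Fixed-step steepest descent; returns the point at which it stops."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            break
        x = x - step * g
    return x

def convex_grad(x):
    # Gradient of the strictly convex f(x, y) = x^2 + 2y^2; the unique global
    # minimum is at the origin, so any local minimum found is the global one.
    return np.array([2 * x[0], 4 * x[1]])

def nonconvex_grad(x):
    # Gradient of f(x) = x^4 - 3x^2 + x, which has two local minima; the one
    # reached depends on the starting point and is not certified to be global.
    return 4 * x**3 - 6 * x + 1

print(steepest_descent(convex_grad, [3.0, -2.0]))  # approx. [0, 0]
print(steepest_descent(nonconvex_grad, [2.0]))     # approx. [1.13]  (local only)
print(steepest_descent(nonconvex_grad, [-2.0]))    # approx. [-1.30] (the global one)
```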
The general point is straightforward. The distinction between pure and applied
mathematics is not always sharp, but assuming some such distinction can be made,
results in pure mathematics that are used in science come with a set of conditions
that must be satisfied in order for those results to be correctly applied. This is as true
for arithmetical results as it is for martingales used on time series data from financial
markets. While the mathematical results themselves can be assessed a priori, the
truth of the application conditions for a given situation usually cannot, and it is
in the absence of such a guarantee that heuristic methods must often be applied.
Many of those methods, which are a legitimate part of applied mathematics, can
be formally represented but the standards of rigor expected of pure mathematics
must be relaxed. So I do not disagree with Suppes' claim concerning the conceptual
equivalence of pure and applied mathematics when applied mathematics is taken
as a self-contained subject, but more broadly construed it can support a different
epistemological attitude while retaining its formal qualities.

1.4 Conclusion


`Models of Data' is a canonical element in Suppes' set-theoretical approach to theories. As data become increasingly available, often in vast quantities, their relation
to theory is evolving, especially in situations in which the role of explicit theory
is small. I have tried to show, albeit in a specialized domain, that explicit models
of the data, of the instrument, and of the transformations made on the data, are
consistent with Suppes' overall message that formal methods can carry us a long
way towards a philosophical appreciation of techniques that are rapidly advancing
in importance yet do not fit the standard picture of a connection between theory
and data.

References

Bogen, J. and J. Woodward. 1988. Saving the phenomena. The Philosophical Review 97(3):303–352.

Buzug, T. M. 2008. Computed Tomography: From Photon Statistics to Modern Cone-Beam CT. Berlin: Springer.

Humphreys, P. 2008. Probability theory and its models. In D. Nolan and T. P. Speed, eds., Probability and Statistics: Essays in Honor of David A. Freedman, pages 1–11. Beachwood, Ohio: Institute of Mathematical Statistics.

Humphreys, P. 2013a. Data analysis: Models or techniques? Foundations of Science 18(3):579–581.

Humphreys, P. 2013b. What are data about? In E. A. et al., ed., Computer Simulations and Experiments. Cambridge: Cambridge Scholars Publishing.
Humphreys, P. Forthcoming. X-ray data and empirical content. LMPS XIV: Proceedings of the 14th Logic, Philosophy, and Methodology of Science Congress, P. Bour et al. (eds). London: College Publications.

Napoletani, D., M. Panza, and D. C. Struppa. 2011. Agnostic science: Towards a philosophy of data analysis. Foundations of Science 16(1):1–20.



Suppes, P. 1962. Models of data. In E. Nagel, P. Suppes, and A. Tarski, eds., Logic, Methodology, and Philosophy of Science: Proceedings of the 1960 International Congress, pages 252–261. Stanford: Stanford University Press.

Suppes, P. 2002. Representation and Invariance of Scientific Structures. Stanford, CA: CSLI Publications.
