Adventures in the network of
theories.
Andreka, H. and Nemeti, I.
Talk
given in Salzburg Conference for Young Analytic Philosophy 2016, Sept. 7 http://www.sophia-conference.org , and also presented
at the Renyi Institute of Mathematics on Sept. 25, 2016. http://www.renyi.hu/~seminar/
Below comes the full text of
the talk, associated to the slides of the presentation.
1. Network of theories.
Network of theories is an efficient tool for organizing scientific knowledge. We
want to share our adventures in this network with you.
Part I. Theory of network of
theories
2. We are in many-sorted
first order logic with equality.
The nodes in the network are
theories. A theory consists of a language and some axioms. When we design a
theory, first we make clear in our minds what kinds of objects we want to talk
about, Animals? Coordinate systems?, Events?, and what relations we want to use
between those. These will be the atomic sorts and atomic relations of our
language. We call them atomic because we do not analyse these concepts further
in the language. Thus the language sets the abstraction level of our theory.
After having decided about the language, we postulate the axioms of our theory.
With these, the theory has been specified.
Language and Axioms.
3. So, the language in a
theory determines the abstraction level of the theory. But the time comes when
we want to change the level of abstraction of our theory. We want to analyze
further the atomic sorts or relations, or just the opposite, we may want to
talk on a higher level of abstraction. The acts of changing the abstraction
level form the connections part of the network.
A connection is interpreting one theory into
another. This is a special function mapping the language of the first theory T1
into the language of the second one, T2. The idea is that atomic sorts and
relations of the first language go to compound sorts and relations of the
second one.
Thus, an interpretation from T1’s point of
view means analyzing the atomic relations further, and this corresponds to
reducing T1 to T2. In the opposite direction, from T2’s point of view the
interpretation means generalization. Also, an interpretation of a theory into
T2 will show how theoretical notions emerge in T2, the interpretation may
reveal rich structure in an otherwise simple-looking world. It may add to T2,
we would say we can add understanding to T2 this way. An example for all these
aspects is the interpretation that we present beginning with slide 12.
So, the slogan is that an interpretation
assigns to atomic sorts and relations compound ones. We can imagine what
compound relations are: the predicate “big and not red” is a compound relation/predicate composed of
the atomic “big” and atomic “red”. But what are compound sorts?
4. In natural language, when
we talk about people and family relations between them, we may talk about
married couples as if they were entities of their own. In this case “married
couples” is a new sort composed of “people”
and “to marry”.
Our generic example will be
the compound sort of lines in a geometry that talks only about points and the
collinearity relation. So, assume that we are in a geometry where the objects
are points and we have one ternary relation, collinearity. When working in such
a geometry, when trying to prove theorems, we also think about lines, in an
intuitive way. But when we write down a proof in which lines occur, we eliminate the intuitive notion of a line
from the proof since the proof is
supposed to talk only about points and collinearity. What we will do now is
that we introduce the compound sort of lines officially with a precise
definition of how to eliminate it as an
abbreviation, and then we will not have to eliminate them from the proofs any
more, since there will be an official, royal eliminating procedure. This will
make our thinking more efficient.
So, what are lines? Lines are sets of
collinear points, a line can be specified by any two of its distinct points as
the set of all points collinear with these two. We say that a line is coded by
a pair of distinct points, any pair of distinct points name, or denote a line.
Thus, pairs of distinct points are the names, or codes for lines. Let’s take a
line. Any of its two distinct points will code it, and we can have no
preference for which one of these pairs to choose. Thus, the relation between codes and lines is
not one-to-one. Do not panick! We define an equivalence relation on the names:
two names are equivalent if they code the same line. Now there is a one-to-one
correspondence between lines and equivalence blocks of this relation. This is
the idea. Let us give now existence and a new home to these new entities.
5. The formula epsilon
here codes the equivalence relation between names. It is an old formula,
i.e., a formula in the language of points and collinearity. The axioms prove
that it is symmetric and transitive. Let us add a new sort-symbol S_eps to the language, we use it as the home
of the lines. We also add a new function symbol
q_eps, for “quotient-forming”, to the language. It is a function
connecting the new sort to the old ones, we call it the neck of the new sort.
The situation shown in the picture, namely that
q_eps establishes a one-to-one
relation between the elements of the new sort S_eps and the blocks of epsilon can easily be described completely by
a formula that talks about q_eps, S_eps
and epsilon. We call that formula the explicit definition of S_eps. This explicit definition makes it
possible to eliminate the new symbols whenever we want. We add this definition
as a new postulate of the extended theory.This is analogous to how we define a
new relation symbol. We add its symbol to the language and we add its
definition to the theory.
6. Here we repeat the same process
in the case of an arbitrary theory T.
Let epsilon be any formula in the language of T and assume that the axioms prove that
epsilon is symmetric and transitive.
Then we add a new sort symbol S_eps, a new function symbol q_eps to the language, and at the same time
we extend the axioms with the explicit definition of this new sort. Then we get greedy, and we do this for every
definable partial equivalence relation
epsilon of T. Thus we add infinitely many new sort and
relation symbols, and infinitely many new axioms. We call the theory we
get the conceptual completion of T and
we denote it with Tc. The name was given
by Michael Makkai, and we think that it is a good name. Why? Because Tc
contains all the new sorts that we can define in T. Assume that you come up
with a totally different kind of new sort and convince us that it is a
definition. Then we will show you that your new sort is already there in Tc. (There is a theorem to this effect, a
Beth-type theorem.) Finally, there is a royal translation function from Tc
to T.
Summing up, in constructing Tc
from T, for every definable partial equivalence
relation epsilon we add
S_eps and q_eps
to the language of T (together
with the corresponding defining formulas). We allowed empty sorts.
7. We constructed Tc to be able to define what
compound sorts are. We define compound sorts of
T to be the sorts of Tc. Compound relations of T are
the open formulas of Tc.
8. Now that we know what compound
sorts and relations are, we can round up the definition of an interpretation.
An interpretation of T1 in T2 is a special function mapping the language
of T1
into that of T2. In harmony with
the slogan in the slide, this function is specified by assigning to the atomic
sorts and relations of T1 compound sorts and relations of T2. This assignment extends naturally along
applying the logical connectives from L1
to the language of T2c. Composing
this function with the royal translation back to L2 gives our function F mapping
L1 to L2.
However, this mapping qualifies as an interpretation only if the axioms of the first theory are
taken to theorems of T2. When this
requirement holds, we get a natural function going in the other direction, from
Mod(Ax2) to Mod(Ax1).
In this case we can use the perfect duality of FOL between syntax and
semantics, thus we can freely switch between talking about formulas and talking
about models.
9. This was the definition of the connections
part of the network. But what are these interpretations *really*? Well, they are exactly the structure
preserving mappings of the concepts of
T1 into those of T2. What is the
structure of concepts of a theory? In
our case it will be an algebra, we denote it by
CA(T), and we define it to be the formula-algebra of Tc. In passing, CA(T) is known as a cylindric algebra. The
interpretations are exactly the homomorphisms between the concept
algebras. In the next slide we define
formula-algebras of many-sorted theories.
10. The formula-algebra CA(T)
is the natural algebra of open formulas of a many-sorted theory T = L+Ax. It is a two-sorted algebra. The sorts-relations duality appears in it
explicitly. One sort S
contains the sort-symbols of the language L of
the theory. The other sort Fm contains the equivalence blocks of the open
formulas of the language under the
equivalence-relation “provable by the theory
Ax”. The operations of the formula-algebra are built from the logical
connectives of many-sorted FOL. “And” maps
Fm x Fm to Fm,
“Neg” maps Fm to Fm, “=ij”
maps S to Fm
and “Ei” maps S
x Fm to
Fm. The unary function “=ij”
assigns to a sort s the [equivalence block of the] formula
vis=vjs vhere vis is the i-th variable of sort s . The binary function “Ei”
assigns to a sort s and (the block of) a formula phi the (block of ) the formula phi quantified by the variable vis. This formula-algebra is a structure of
concepts of the theory. The concepts are the elements of S and Fm, and the structure is formed by the
operations.
At this point, we want to say some words about
history. Once upon a time there was no definability theory. Then Hans
Reichenbach suggested that we need one, and Alfred Tarski brought it into
existence. This definability theory was elegant and insightful. E.g.,
interpretations between theories were just assignments of open formulas to
atomic ones, and these corresponded to homomorphisms between the most natural
algebras of open formulas, the Lindenbaum-Tarski algebras. But life went on,
and in the applications new and new kinds of interpretations proved to be
useful. The picture got complicated. We interpreted a structure into only a
subset of the universe of all entities. Well, this forced us to talk about
relativizations of the formula-algebras. But then we had to interpret into a
quotient of a subset of a universe, and this really did not correspond to
anything natural in terms of formula-algebras. Oh well, we had to live with
very useful but ad-hoc kinds of interpretations. We lost elegance, we lost
understanding. But life went on in different areas of logic, too. Saharon
Shelah introduced the extension Teq of a theory T to be able to talk more
naturally about types, and Michael Makkai discovered that this extended theory
was the same as his pretopos-extension of a theory. Victor Harnik explained the
connection, as a birthday present to Michael Makkai, in a paper to be meant
both for model theorists and category theorists. We read the paper recently and
had the rare aha experience when all of a sudden everything fell into place in
the picture of a new definability theory that we had been using since 2002 in
axiomatizing relativity theory. The definitional extension that we had been
using proved to be closely related to Shelah’s and Makkai’s constructions, this
is the present Tc. The concept algebra of a theory became the Lindenbaum-Tarski
algebra of its conceptual completion, and an interpretation (of the widest
type) became again a homomorphism between the concept algebras of the two
theories in question! We are back in the
original elegant picture of Tarski, but on a higher level.
11. Though we think that
interpretations of theories are more important than their equivalence, let us
introduce equivalence of theories as a special case of interpretations. We say
that two theories are new-definitionally equivalent if the formula-algebras of
their conceptual completions are isomorphic. We call this relation also
conceptual equivalence, because we think that this name shows nicely the
content of this relation. Conceptual equivalence is a generalization of the old
definitional equivalence of Tarski, it is the same what Judit Madarasz
introduced in her dissertation in 2002, and it is the same what Barrett and
Halvorson began to call recently Morita-equivalence.
Part II. Two examples
Instead of going into further
detail about the theory of the network of theories, we try to convey how it
feels like to live in it. We, our group, have been living in this world for
quite a while by now, and our experience is that it is livable. We give you the
main ideas of two interpretations. The first one is well elaborated and is
published in a paper. The second one is a work under construction, you will see
the application in the making.
12. In the first example, we
connect two theories: a rich theory and a poor theory. The poor theory is a
theory of signals (or photons), a theory of communication, while the rich
theory is a spacetime theory, special relativity. The interpretation we
are about to show you provides an experimental, operational semantics for
special relativity. In the other direction, this interpretation shows how a
society living in the sparesome word with signaling as the only tool, can
discover time and space as useful notions for survival. Thus in our story there
is a world without time and space, hence no motion no change, and then time and
space emerge as a result of a cognitive process. We define time and space (over
timeless objects) as derived notions by using the structure of the poor world
of experimenters and signals. This amounts to a concrete mathematical
realization of Leibnizian relational view of time and space. We begin with presenting the two theories.
13. The poor theory is called
Signaling Theory, SigTh in short. It has a language where we have two sorts of
entities, experimenters (or observers) and signals (or photons), and we have
two binary relations: an experimenter can send out a signal and an experimenter
can receive a signal. So this theory is based on two binary relations between
experimenters and signals.
14. That was the language.
The axioms of Signaling Theory are all the formulas valid in its intended
model. The intended model is the following. We are in four-dimensional
Euclidean space R to the four, where R
is the set of real numbers. Experimenters are represented as straight
lines of slope less than 1 (i.e., they are more vertical than horizontal), and
signals are represented as finite nonzero segments of lines with slope 1. An experimenter sends out a signal if the
starting point of the signal lies on the line representing the experimenter,
and an experimenter receives a signal if the end point of the signal lies on
the line representing the experimenter. This is the intended model, and the axioms of SigTh are the formulas valid in
this model (in the FO language consisting of Sends and Receives).
A few words about Signaling Theory. A line
in four-dimensional space represents motion in three-dimensional space.
Experimenters represented with vertical lines are motionless and experimenters
represented with slanted lines are in motion. Our experimenters, however, have
no senses for motion, or time, or space. All what they can perceive of their
world is sending out and receiving signals. For example, an experimenter can
perceive that he sent out a signal and another experimenter e’ received it. He
also can know that the second experimenter e’ sent out another signal and he
received it. But he cannot perceive whether this second signal was sent out
earlier, later, or at the same time when the first one. In this model, from the
point of view of sending and receiving signals, all the experimenters are
alike: any one of them can be taken to any other by an automorphism. Hence, no
matter how clever our experimenters are, they cannot distinguish the motionless
among them. This is why they will discover special relativity as a spacetime
for their world, and not Newtonian absolute time.
15. A visual representation
of this intended model. Here are some motionless experimenters, they can send
out and receive signals.
16. We have running
experimenters, too, they also can send out and receive signals.
17.
However, our experimenters have no clocks and no meter rods. Yet, eventually
they will discover time and motion.
18.
Let’s turn to our rich theory. Its language has three universes or sorts. These
are the sort of observers, the sort of photons and the sort of numbers. These
three sorts are connected by a 6-place relation called worldview relation, this
represents coordinatization. The worldview relation tells us which observer
sees which bodies at what coordinates. One can view the language as consisting
of a set of coordinate systems, a coordinate system for each observer. This
language is quite convenient for talking about motion, time, space, velocity,
acceleration, etc.
19.
Here is how to draw the world-view
relation, and world-lines in coordinate systems.
20.
So, the language of SpecRel talks about a bunch of coordinate systems, one for
each observer. It does not talk directly about the house in the picture, only
about the several views of this house.
21.
We presented the language, next come the axioms. We present SpecRel by listing
its 3 axioms. Axiom1: All observers see photons move with the same speed in all
directions. (Hence there is such a notion as the speed of light.) Axiom2:All
observers see the same events (an event is meeting of two or more photons). Axiom3:The numbers are the usual.
By this we presented SpecRel. We note that
all the characteristic predictions of special relativity can be formulated and
are proved from these three axioms. E.g., moving clocks slow down. This axiom
system (or variations thereof) are well investigated in our papers. Having
introduced our two theories, now we show an interpretation between them.
22. By now we have our two theories and we
present now an interpretation between them. From SpecRel’s point of view, this
is analysing the notion of a coordinate system further. In SpecRel, a
coordinate system is an atomic notion, we do not address the question of how we
get a coordinate system in our world. The interpretation to be presented
assigns to the sort of coordinate systems an explicit definition of a compound
sort in SigTh. Such an explicit definition corresponds to a concrete experiment
in terms of experimenters sending and receiving signals. Thus this explicit
definition gives an answer to how we get/build/construct a coordinate system in
our world. In short, the interpretation provides an operational semantics for
SpecRel. From SigTh’s point of view, the interpretation shows how theoretical
notions emerge in it. The theoretical notions emerging via this interpretation
are the atomic notions of SpecRel. In this context, an explicit definition of an atomic concept of SpecRel
can be considered to be an experiment to be made in the poor world to establish
whether this notion/concept of the rich theory holds in a particular situation
or not. In the end, the experimenters may devise enough theoretical notions to
recover the whole Minkowskian spacetime, each experimenter having a private
copy of R^4 with notions of space and time. This amounts to understanding their
own world. There is a third aspect to this interpretation. In our story there
is a world without time and space, hence no motion no change, and then time and
space emerge as a result of a cognitive process. We define time and space (over
timeless objects) as derived notions by using the structure of the poor world
of experimenters and signals. This amounts to a concrete mathematical
realization of Leibnizian relational view of time and space.
An interpretation of SpecRel into SigTh
assigns to the atomic sorts Observers, Photons, and Quantities of SpecRel
compound sorts in SigTh, these latter are composed of experimenters, signals,
receives, sends. One can imagine that observers will correspond to
experimenters and photons to signals, somehow. But how shall we get quantities? We concentrate on this step.
23.
The idea for getting quantities is that first we get a geometry and from this
we get a field of quantities by Hilbert’s coordinatization method. In this
slide we define the geometry. We define the compound sort of points (or events)
as the sort of signals factorized under a relation expressing that two signals
have the same beginnings.
We
can express that the beginning of s1 is
the same as that of s2 by saying
that “each experimenter who sends s1 also sends
s2”. We define a collinearity relation between the points, by stating
that three points (events) are collinear if there is an experimenter who sent
out all three of these. More precisely, this is “timelike” collinearity, the
rest can be defined from these.
24.
We have an affine geometry over a field. So the field can be recovered from the
structure of lines via the Hilbert coordinatization method. Let’s turn to this
step. After selecting two distinct points on a line, we can define – by using
the structure of all the lines – an addition and multiplication on this line
that will be isomorphic to our original field R of the intended model. But,
which concrete one to choose of these as our quantity sort? All lines are
alike, no one of them is definable by itself. We do not know which one to
choose, so we take all of them!
This is completely analogous to the generic example where the names of lines
were pairs of distinct points. Here, the names of quantities (real numbers) are
triples of collinear points. Two such triplets
denote the same entity if there is a suitable isomorphism between the
two lines they specify.
The present definition of quantities is
analogous, we think, to the process of how natural numbers are devised. The
number 1 is an abstraction of “one
apple”, “one pear”, “one meter”, “one centimeter”, “one” of everything that we
declare to be the unit. Therefore the present way of defining new sorts of
entities can contribute to understanding in which sense abstract entities,
e.g., numbers exist in our world.
We do not continue the construction of the
interpretation. The rest of the construction can be found in our paper
“Comparing theories …” and also in the three presentations about “The end of
time”. These can be found on our homepages.
25.
The interpretation we have constructed is onto but not one-to-one. We can add
axioms to SpecRel to get SpecRel+ which is conceptually equivalent to SigTh.
These extra axioms are “simplifying” in nature, e.g., we state that if the
lifelines of two photons coincide then they are equal. This conceptual
equivalence can be used to resolve the dilemma when presenting a theory: take
few atomic notions with many axioms or
take more atomic notions with less number of axioms. Or, resolving the dilemma
between taking theoretic-oriented atomic
concepts and few elegant axioms and taking more experiment-oriented atomic
concepts.
SigTh is conceptually equivalent with
light-like separability on four-tuples of reals. Minkowski-distance is not
expressible in SigTh, but Minkowski-equidistance is. SigTh is conceptually
equivalent to Minkowski-equidistance on four-tuples of reals.
26.
Are not there too many conceptually equivalent theories? No. The theory of
fields, special relativity, Newtonian mechanics are all conceptually
nonequivalent theories.
27. As another example for an interpretation, we
want to show you our latest endeavor. This is work under construction, you will
see our first stumbling steps.
We already have a theory SpecRel for special
relativity, where we have an atomic sort for photons. These are entities that
have nothing else than their world-lines.
We now want to analyse further the notion of a photon as something as an
electromagnetic wave, so they will have energy and color. We change the
abstraction level by constructing a FOL-theory MaxWel for electrodynamics
first, and then by interpreting SpecRel into it.
28. So we want to design the
theory MaxWel. The first thing is to make it clear to ourselves what kinds of
objects we want to talk about. We want to have a coordinate-oriented approach,
since SpecRel is coordinate-oriented. So, we have a sort for
coordinate-systems, we call them also observers. With coordinate-systems a sort
of quantities comes naturally that is used as the coordinate-values in the
coordinate-systems. We also want some entities for talking about
electromagnetic waves. The most natural candidates are the electric and the
magnetic fields, E and M. However, as
they transform between the coordinate-systems in a quite non-intuitive way, we
want something more insightful. So we choose test-charges and we use the world-lines of those to define
E,M. These sorts would suffice already, but we have learnt from Marton Gomori’s
dissertation that it is better to have one more sort called the sort of
physical phenomena. These will represent different solutions of Maxwell’s
equations. Regarding to relations, basically we have only one, called the world-view relation. It connects all these
sorts in a natural way.
29. Having test-charges,
first we define E,M from them. Acceleration is equivalent with force in this
context (we assume that the test-charges have unit masses). E is defined as the
acceleration of the test-charge that accelerates from a position at rest (at
txyz) in the coordinate system [i.e., whose velocity is zero at txyz]. We can
define M in two ways. We will formulate both definitions and we will show that
they are equivalent. One of the definitions comes from stating the
Lorentz-force law together with showing that only one choice of M satisfies it.
The other definition uses the accelerations of three test-charges moving in the
directions of the three coordinate-axes
x,y,z. This other definition provides a formula for how to compute the
accelerations of all the other test-charges from these three ones.
30. Then we state the axioms
of MaxWel. Of course, the main axioms will be the two Maxwell’s equations, and
the Lorentz-force Law stated for the above defined E and M. Then we state the auxiliary axioms. E.g.,
we state axioms that enable the definitions of E,M to be meaningful. We state
as many auxiliary axioms as we need to make an advance, and later or meantime,
we can streamline and “clean” the axiom system.
The main idea of the interpretation
is to define light-like separability in MaxWel. This will be sufficient since
we know already that this is the main ingredient of SpecRel. We plan to prove
in our axiom-system that in plane-waves with a wave-front (where E,M become zero), this wave-front has to move with the
speed of light.