Adventures in the network of theories.

Andreka, H. and Nemeti, I.

 

Talk given in Salzburg Conference for Young Analytic Philosophy 2016, Sept. 7 http://www.sophia-conference.org , and also presented at the Renyi Institute of Mathematics on Sept. 25, 2016. http://www.renyi.hu/~seminar/

 

 

Below comes the full text of the talk, associated to the slides of the presentation.

 

1. Network of theories. Network of theories is an efficient tool for organizing scientific knowledge. We want to share our adventures in this network with you.

 

Part I. Theory of network of theories

 

2. We are in many-sorted first order logic with equality.

The nodes in the network are theories. A theory consists of a language and some axioms. When we design a theory, first we make clear in our minds what kinds of objects we want to talk about, Animals? Coordinate systems?, Events?, and what relations we want to use between those. These will be the atomic sorts and atomic relations of our language. We call them atomic because we do not analyse these concepts further in the language. Thus the language sets the abstraction level of our theory. After having decided about the language, we postulate the axioms of our theory. With these, the theory has been specified.  Language and Axioms.

 

3. So, the language in a theory determines the abstraction level of the theory. But the time comes when we want to change the level of abstraction of our theory. We want to analyze further the atomic sorts or relations, or just the opposite, we may want to talk on a higher level of abstraction. The acts of changing the abstraction level form the connections part of the network.

   A connection is interpreting one theory into another. This is a special function mapping the language of the first theory T1 into the language of the second one, T2. The idea is that atomic sorts and relations of the first language go to compound sorts and relations of the second one.

   Thus, an interpretation from T1’s point of view means analyzing the atomic relations further, and this corresponds to reducing T1 to T2. In the opposite direction, from T2’s point of view the interpretation means generalization. Also, an interpretation of a theory into T2 will show how theoretical notions emerge in T2, the interpretation may reveal rich structure in an otherwise simple-looking world. It may add to T2, we would say we can add understanding to T2 this way. An example for all these aspects is the interpretation that we present beginning with slide 12.

   So, the slogan is that an interpretation assigns to atomic sorts and relations compound ones. We can imagine what compound relations are: the predicate “big and not red”  is a compound relation/predicate composed of the atomic “big” and atomic “red”. But what are compound sorts?

 

4. In natural language, when we talk about people and family relations between them, we may talk about married couples as if they were entities of their own. In this case “married couples” is a new sort composed of  “people” and “to marry”.

Our generic example will be the compound sort of lines in a geometry that talks only about points and the collinearity relation. So, assume that we are in a geometry where the objects are points and we have one ternary relation, collinearity. When working in such a geometry, when trying to prove theorems, we also think about lines, in an intuitive way. But when we write down a proof in which lines occur,  we eliminate the intuitive notion of a line from the proof  since the proof is supposed to talk only about points and collinearity. What we will do now is that we introduce the compound sort of lines officially with a precise definition of  how to eliminate it as an abbreviation, and then we will not have to eliminate them from the proofs any more, since there will be an official, royal eliminating procedure. This will make our thinking more efficient.

   So, what are lines? Lines are sets of collinear points, a line can be specified by any two of its distinct points as the set of all points collinear with these two. We say that a line is coded by a pair of distinct points, any pair of distinct points name, or denote a line. Thus, pairs of distinct points are the names, or codes for lines. Let’s take a line. Any of its two distinct points will code it, and we can have no preference for which one of these pairs to choose.  Thus, the relation between codes and lines is not one-to-one. Do not panick! We define an equivalence relation on the names: two names are equivalent if they code the same line. Now there is a one-to-one correspondence between lines and equivalence blocks of this relation. This is the idea. Let us give now existence and a new home to these new entities.

 

5. The formula  epsilon  here codes the equivalence relation between names. It is an old formula, i.e., a formula in the language of points and collinearity. The axioms prove that it is symmetric and transitive. Let us add a new sort-symbol  S_eps to the language, we use it as the home of the lines. We also add a new function symbol  q_eps, for “quotient-forming”, to the language. It is a function connecting the new sort to the old ones, we call it the neck of the new sort. The situation shown in the picture, namely that  q_eps  establishes a one-to-one relation between the elements of the new sort S_eps and the blocks of  epsilon can easily be described completely by a formula that talks about  q_eps,  S_eps  and epsilon. We call that formula the explicit definition of  S_eps. This explicit definition makes it possible to eliminate the new symbols whenever we want. We add this definition as a new postulate of the extended theory.This is analogous to how we define a new relation symbol. We add its symbol to the language and we add its definition to the theory.

 

6. Here we repeat the same process in the case of an arbitrary theory  T. Let  epsilon  be any formula in the language of  T and assume that the axioms prove that epsilon is symmetric and transitive.  Then we add a new sort symbol S_eps, a new function symbol  q_eps to the language, and at the same time we extend the axioms with the explicit definition of this new sort.  Then we get greedy, and we do this for every definable partial equivalence relation  epsilon  of  T. Thus we add infinitely many new sort and relation symbols, and infinitely many new axioms. We call the theory we get  the conceptual completion of  T  and we denote it with  Tc. The name was given by Michael Makkai, and we think that it is a good name. Why? Because Tc contains all the new sorts that we can define in T. Assume that you come up with a totally different kind of new sort and convince us that it is a definition. Then we will show you that your new sort is already there in  Tc. (There is a theorem to this effect, a Beth-type theorem.) Finally, there is a royal translation function from  Tc  to  T.

   Summing up, in constructing  Tc  from  T,  for every definable partial equivalence relation  epsilon  we add  S_eps  and  q_eps  to the language of  T (together with the corresponding defining formulas). We allowed empty sorts.

 

7.  We constructed Tc to be able to define what compound sorts are. We define compound sorts of  T to be the sorts of  Tc.  Compound relations of  T  are the open formulas of  Tc.

 

8. Now that we know what compound sorts and relations are, we can round up the definition of an interpretation. An interpretation of  T1  in  T2  is a special function mapping the language of  T1  into that of  T2. In harmony with the slogan in the slide, this function is specified by assigning to the atomic sorts and relations of T1 compound sorts and relations of  T2. This assignment extends naturally along applying the logical connectives from L1  to  the language of T2c. Composing this function with the royal translation back to L2 gives our function  F mapping  L1  to  L2.  However, this mapping qualifies as an interpretation  only if the axioms of the first theory are taken to theorems  of T2. When this requirement holds, we get a natural function going in the other direction,  from  Mod(Ax2)  to  Mod(Ax1).  In this case we can use the perfect duality of FOL between syntax and semantics, thus we can freely switch between talking about formulas and talking about models.

 

9.  This was the definition of the connections part of the network. But what are these interpretations *really*?  Well, they are exactly the structure preserving mappings of the concepts of  T1 into those of  T2. What is the structure of concepts of a theory?  In our case it will be an algebra, we denote it by  CA(T), and we define it to be the formula-algebra of  Tc. In passing,  CA(T) is known as a cylindric algebra. The interpretations are exactly the homomorphisms between the concept algebras.   In the next slide we define formula-algebras of many-sorted theories.

 

10. The formula-algebra  CA(T)  is the natural algebra of open formulas of a many-sorted theory  T =  L+Ax.  It is a two-sorted algebra. The sorts-relations duality appears in it explicitly.  One sort  S  contains the sort-symbols of the language  L  of the theory. The other sort  Fm  contains the equivalence blocks of the open formulas of the language under  the equivalence-relation “provable by the theory  Ax”. The operations of the formula-algebra are built from the logical connectives of many-sorted FOL. “And” maps  Fm x Fm  to  Fm,  “Neg” maps  Fm to Fm,  “=ij”  maps  S  to Fm  and  “Ei”  maps  S x Fm  to  Fm.   The unary function  “=ij”  assigns to a sort  s  the [equivalence block of the] formula vis=vjs vhere  vis  is the i-th variable of sort  s . The binary function  “Ei”  assigns to a sort  s  and (the block of) a formula  phi the (block of ) the formula  phi quantified by the variable  vis. This formula-algebra is a structure of concepts of the theory. The concepts are the elements of  S  and  Fm, and the structure is formed by the operations.

 

   At this point, we wantereHe       to say some words about history. Once upon a time there was no definability theory. Then Hans Reichenbach suggested that we need one, and Alfred Tarski brought it into existence. This definability theory was elegant and insightful. E.g., interpretations between theories were just assignments of open formulas to atomic ones, and these corresponded to homomorphisms between the most natural algebras of open formulas, the Lindenbaum-Tarski algebras. But life went on, and in the applications new and new kinds of interpretations proved to be useful. The picture got complicated. We interpreted a structure into only a subset of the universe of all entities. Well, this forced us to talk about relativizations of the formula-algebras. But then we had to interpret into a quotient of a subset of a universe, and this really did not correspond to anything natural in terms of formula-algebras. Oh well, we had to live with very useful but ad-hoc kinds of interpretations. We lost elegance, we lost understanding. But life went on in different areas of logic, too. Saharon Shelah introduced the extension Teq of a theory T to be able to talk more naturally about types, and Michael Makkai discovered that this extended theory was the same as his pretopos-extension of a theory. Victor Harnik explained the connection, as a birthday present to Michael Makkai, in a paper to be meant both for model theorists and category theorists. We read the paper recently and had the rare aha experience when all of a sudden everything fell into place in the picture of a new definability theory that we had been using since 2002 in axiomatizing relativity theory. The definitional extension that we had been using proved to be closely related to Shelah’s and Makkai’s constructions, this is the present Tc. The concept algebra of a theory became the Lindenbaum-Tarski algebra of its conceptual completion, and an interpretation (of the widest type) became again a homomorphism between the concept algebras of the two theories in question!  We are back in the original elegant picture of Tarski, but on a higher level.

 

11. Though we think that interpretations of theories are more important than their equivalence, let us introduce equivalence of theories as a special case of interpretations. We say that two theories are new-definitionally equivalent if the formula-algebras of their conceptual completions are isomorphic. We call this relation also conceptual equivalence, because we think that this name shows nicely the content of this relation. Conceptual equivalence is a generalization of the old definitional equivalence of Tarski, it is the same what Judit Madarasz introduced in her dissertation in 2002, and it is the same what Barrett and Halvorson began to call recently Morita-equivalence.

 

Part II. Two examples

 

Instead of going into further detail about the theory of the network of theories, we try to convey how it feels like to live in it. We, our group, have been living in this world for quite a while by now, and our experience is that it is livable. We give you the main ideas of two interpretations. The first one is well elaborated and is published in a paper. The second one is a work under construction, you will see the application in the making. 

 

12. In the first example, we connect two theories: a rich theory and a poor theory. The poor theory is a theory of signals (or photons), a theory of communication, while the rich theory is a spacetime theory, special relativity. The interpretation we are about to show you provides an experimental, operational semantics for special relativity. In the other direction, this interpretation shows how a society living in the sparesome word with signaling as the only tool, can discover time and space as useful notions for survival. Thus in our story there is a world without time and space, hence no motion no change, and then time and space emerge as a result of a cognitive process. We define time and space (over timeless objects) as derived notions by using the structure of the poor world of experimenters and signals. This amounts to a concrete mathematical realization of Leibnizian relational view of time and space.  We begin with presenting the two theories.

 

13. The poor theory is called Signaling Theory, SigTh in short. It has a language where we have two sorts of entities, experimenters (or observers) and signals (or photons), and we have two binary relations: an experimenter can send out a signal and an experimenter can receive a signal. So this theory is based on two binary relations between experimenters and signals.

 

14. That was the language. The axioms of Signaling Theory are all the formulas valid in its intended model. The intended model is the following. We are in four-dimensional Euclidean space  R to the four,  where R  is the set of real numbers. Experimenters are represented as straight lines of slope less than 1 (i.e., they are more vertical than horizontal), and signals are represented as finite nonzero segments of lines with slope 1.  An experimenter sends out a signal if the starting point of the signal lies on the line representing the experimenter, and an experimenter receives a signal if the end point of the signal lies on the line representing the experimenter. This is the intended model, and  the axioms of SigTh are the formulas valid in this model (in the FO language consisting of Sends and Receives).

    A few words about Signaling Theory. A line in four-dimensional space represents motion in three-dimensional space. Experimenters represented with vertical lines are motionless and experimenters represented with slanted lines are in motion. Our experimenters, however, have no senses for motion, or time, or space. All what they can perceive of their world is sending out and receiving signals. For example, an experimenter can perceive that he sent out a signal and another experimenter e’ received it. He also can know that the second experimenter e’ sent out another signal and he received it. But he cannot perceive whether this second signal was sent out earlier, later, or at the same time when the first one. In this model, from the point of view of sending and receiving signals, all the experimenters are alike: any one of them can be taken to any other by an automorphism. Hence, no matter how clever our experimenters are, they cannot distinguish the motionless among them. This is why they will discover special relativity as a spacetime for their world, and not Newtonian absolute time.

 

15. A visual representation of this intended model. Here are some motionless experimenters, they can send out and receive signals.

 

16. We have running experimenters, too, they also can send out and receive signals.

 

17. However, our experimenters have no clocks and no meter rods. Yet, eventually they will discover time and motion.

 

18. Let’s turn to our rich theory. Its language has three universes or sorts. These are the sort of observers, the sort of photons and the sort of numbers. These three sorts are connected by a 6-place relation called worldview relation, this represents coordinatization. The worldview relation tells us which observer sees which bodies at what coordinates. One can view the language as consisting of a set of coordinate systems, a coordinate system for each observer. This language is quite convenient for talking about motion, time, space, velocity, acceleration, etc.

 

19. Here is  how to draw the world-view relation, and world-lines in coordinate systems.

 

20. So, the language of SpecRel talks about a bunch of coordinate systems, one for each observer. It does not talk directly about the house in the picture, only about the several views of this house.

 

21. We presented the language, next come the axioms. We present SpecRel by listing its 3 axioms. Axiom1: All observers see photons move with the same speed in all directions. (Hence there is such a notion as the speed of light.) Axiom2:All observers see the same events (an event is meeting of two or more photons).  Axiom3:The numbers are the usual.

     By this we presented SpecRel. We note that all the characteristic predictions of special relativity can be formulated and are proved from these three axioms. E.g., moving clocks slow down. This axiom system (or variations thereof) are well investigated in our papers. Having introduced our two theories, now we show an interpretation between them.

 

22.  By now we have our two theories and we present now an interpretation between them. From SpecRel’s point of view, this is analysing the notion of a coordinate system further. In SpecRel, a coordinate system is an atomic notion, we do not address the question of how we get a coordinate system in our world. The interpretation to be presented assigns to the sort of coordinate systems an explicit definition of a compound sort in SigTh. Such an explicit definition corresponds to a concrete experiment in terms of experimenters sending and receiving signals. Thus this explicit definition gives an answer to how we get/build/construct a coordinate system in our world. In short, the interpretation provides an operational semantics for SpecRel. From SigTh’s point of view, the interpretation shows how theoretical notions emerge in it. The theoretical notions emerging via this interpretation are the atomic notions of SpecRel. In this context, an explicit definition of an atomic concept of SpecRel can be considered to be an experiment to be made in the poor world to establish whether this notion/concept of the rich theory holds in a particular situation or not. In the end, the experimenters may devise enough theoretical notions to recover the whole Minkowskian spacetime, each experimenter having a private copy of R^4 with notions of space and time. This amounts to understanding their own world. There is a third aspect to this interpretation. In our story there is a world without time and space, hence no motion no change, and then time and space emerge as a result of a cognitive process. We define time and space (over timeless objects) as derived notions by using the structure of the poor world of experimenters and signals. This amounts to a concrete mathematical realization of Leibnizian relational view of time and space.

    An interpretation of SpecRel into SigTh assigns to the atomic sorts Observers, Photons, and Quantities of SpecRel compound sorts in SigTh, these latter are composed of experimenters, signals, receives, sends. One can imagine that observers will correspond to experimenters and photons to signals, somehow. But how shall we get quantities?  We concentrate on this step.

 

23. The idea for getting quantities is that first we get a geometry and from this we get a field of quantities by Hilbert’s coordinatization method. In this slide we define the geometry. We define the compound sort of points (or events) as the sort of signals factorized under a relation expressing that two signals have the same beginnings.

We can express that the beginning of  s1  is the same as that of  s2 by saying that “each experimenter who sends  s1  also sends  s2”. We define a collinearity relation between the points, by stating that three points (events) are collinear if there is an experimenter who sent out all three of these. More precisely, this is “timelike” collinearity, the rest can be defined from these.

 

24. We have an affine geometry over a field. So the field can be recovered from the structure of lines via the Hilbert coordinatization method. Let’s turn to this step. After selecting two distinct points on a line, we can define – by using the structure of all the lines – an addition and multiplication on this line that will be isomorphic to our original field R of the intended model. But, which concrete one to choose of these as our quantity sort? All lines are alike, no one of them is definable by itself. We do not know which one to choose, so we take all of them! This is completely analogous to the generic example where the names of lines were pairs of distinct points. Here, the names of quantities (real numbers) are triples of collinear points. Two such triplets  denote the same entity if there is a suitable isomorphism between the two lines they specify.

   The present definition of quantities is analogous, we think, to the process of how natural numbers are devised. The number 1 is an abstraction of  “one apple”, “one pear”, “one meter”, “one centimeter”, “one” of everything that we declare to be the unit. Therefore the present way of defining new sorts of entities can contribute to understanding in which sense abstract entities, e.g., numbers exist in our world.

   We do not continue the construction of the interpretation. The rest of the construction can be found in our paper “Comparing theories …” and also in the three presentations about “The end of time”. These can be found on our homepages.

 

25. The interpretation we have constructed is onto but not one-to-one. We can add axioms to SpecRel to get SpecRel+ which is conceptually equivalent to SigTh. These extra axioms are “simplifying” in nature, e.g., we state that if the lifelines of two photons coincide then they are equal. This conceptual equivalence can be used to resolve the dilemma when presenting a theory: take few atomic notions with many axioms  or take more atomic notions with less number of axioms. Or, resolving the dilemma between taking  theoretic-oriented atomic concepts and few elegant axioms  and  taking more experiment-oriented atomic concepts.

    SigTh is conceptually equivalent with light-like separability on four-tuples of reals. Minkowski-distance is not expressible in SigTh, but Minkowski-equidistance is. SigTh is conceptually equivalent to Minkowski-equidistance on four-tuples of reals.

 

26. Are not there too many conceptually equivalent theories? No. The theory of fields, special relativity, Newtonian mechanics are all conceptually nonequivalent theories.

 

27.  As another example for an interpretation, we want to show you our latest endeavor. This is work under construction, you will see our first stumbling steps.

   We already have a theory SpecRel for special relativity, where we have an atomic sort for photons. These are entities that have nothing else than their world-lines.  We now want to analyse further the notion of a photon as something as an electromagnetic wave, so they will have energy and color. We change the abstraction level by constructing a FOL-theory MaxWel for electrodynamics first, and then by interpreting SpecRel into it.

 

28. So we want to design the theory MaxWel. The first thing is to make it clear to ourselves what kinds of objects we want to talk about. We want to have a coordinate-oriented approach, since SpecRel is coordinate-oriented. So, we have a sort for coordinate-systems, we call them also observers. With coordinate-systems a sort of quantities comes naturally that is used as the coordinate-values in the coordinate-systems. We also want some entities for talking about electromagnetic waves. The most natural candidates are the electric and the magnetic fields,  E and M. However, as they transform between the coordinate-systems in a quite non-intuitive way, we want something more insightful. So we choose test-charges and  we use the world-lines of those to define E,M. These sorts would suffice already, but we have learnt from Marton Gomori’s dissertation that it is better to have one more sort called the sort of physical phenomena. These will represent different solutions of Maxwell’s equations. Regarding to relations, basically we have only one, called  the world-view relation. It connects all these sorts in a natural way.

 

29. Having test-charges, first we define E,M from them. Acceleration is equivalent with force in this context (we assume that the test-charges have unit masses). E is defined as the acceleration of the test-charge that accelerates from a position at rest (at txyz) in the coordinate system [i.e., whose velocity is zero at txyz]. We can define M in two ways. We will formulate both definitions and we will show that they are equivalent. One of the definitions comes from stating the Lorentz-force law together with showing that only one choice of M satisfies it. The other definition uses the accelerations of three test-charges moving in the directions of the three coordinate-axes  x,y,z. This other definition provides a formula for how to compute the accelerations of all the other test-charges from these three ones.

 

30. Then we state the axioms of MaxWel. Of course, the main axioms will be the two Maxwell’s equations, and the Lorentz-force Law stated for the above defined  E  and  M. Then we state the auxiliary axioms. E.g., we state axioms that enable the definitions of E,M to be meaningful. We state as many auxiliary axioms as we need to make an advance, and later or meantime, we can streamline and “clean” the axiom system.

The main idea of the interpretation is to define light-like separability in MaxWel. This will be sufficient since we know already that this is the main ingredient of SpecRel. We plan to prove in our axiom-system that in plane-waves with a wave-front (where E,M become  zero), this wave-front has to move with the speed of light.