Nobel Lecture, December 11, 1965
We have a habit in writing articles
published in scientific journals to make the work as finished as
possible, to cover all the tracks, to not worry about the blind
alleys or to describe how you had the wrong idea first, and so
on. So there isn't any place to publish, in a dignified manner,
what you actually did in order to get to do the work, although,
there has been in these days, some interest in this kind of
thing. Since winning the prize is a personal thing, I thought I
could be excused in this particular situation, if I were to talk
personally about my relationship to quantum electrodynamics,
rather than to discuss the subject itself in a refined and
finished fashion. Furthermore, since there are three people who
have won the prize in physics, if they are all going to be
talking about quantum electrodynamics itself, one might become
bored with the subject. So, what I would like to tell you about
today are the sequence of events, really the sequence of ideas,
which occurred, and by which I finally came out the other end
with an unsolved problem for which I ultimately received a
prize.
I realize that a truly scientific paper
would be of greater value, but such a paper I could publish in
regular journals. So, I shall use this Nobel Lecture as an
opportunity to do something of less value, but which I cannot do
elsewhere. I ask your indulgence in another manner. I shall
include details of anecdotes which are of no value either
scientifically, nor for understanding the development of ideas.
They are included only to make the lecture more entertaining.
I worked on this problem about eight years
until the final publication in 1947. The beginning of the thing
was at the Massachusetts Institute of Technology, when I was an
undergraduate student reading about the known physics, learning
slowly about all these things that people were worrying about,
and realizing ultimately that the fundamental problem of the day
was that the quantum theory of electricity and magnetism was not
completely satisfactory. This I gathered from books like those of
Heitler and Dirac. I was inspired by
the remarks in these books; not by the parts in which everything
was proved and demonstrated carefully and calculated, because I
couldn't understand those very well. At the young age what I
could understand were the remarks about the fact that this
doesn't make any sense, and the last sentence of the book of
Dirac I can still remember, "It seems that some essentially new
physical ideas are here needed." So, I had this as a challenge
and an inspiration. I also had a personal feeling, that since
they didn't get a satisfactory answer to the problem I wanted to
solve, I don't have to pay a lot of attention to what they did
do.
I did gather from my readings, however,
that two things were the source of the difficulties with the
quantum electrodynamical theories. The first was an infinite
energy of interaction of the electron with itself. And this
difficulty existed even in the classical theory. The other
difficulty came from some infinities which had to do with the
infinite number of degrees of freedom in the field. As I
understood it at the time (as nearly as I can remember) this was
simply the difficulty that if you quantized the harmonic
oscillators of the field (say in a box) each oscillator has a
ground state energy of $\tfrac{1}{2}\hbar\omega$ and there is an infinite number of modes in a box of
ever increasing frequency $\omega$, and
therefore there is an infinite energy in the box. I now realize
that that wasn't a completely correct statement of the central
problem; it can be removed simply by changing the zero from which
energy is measured. At any rate, I believed that the difficulty
arose somehow from a combination of the electron acting on itself
and the infinite number of degrees of freedom of the field.
Well, it seemed to me quite evident that
the idea that a particle acts on itself, that the electrical
force acts on the same particle that generates it, is not a
necessary one - it is a sort of a silly one, as a matter of fact.
And, so I suggested to myself, that electrons cannot act on
themselves, they can only act on other electrons. That means
there is no field at all. You see, if all charges contribute to
making a single common field, and if that common field acts back
on all the charges, then each charge must act back on itself.
Well, that was where the mistake was, there was no field. It was
just that when you shook one charge, another would shake later.
There was a direct interaction between charges, albeit with a
delay. The law of force connecting the motion of one charge with
another would just involve a delay. Shake this one, that one
shakes later. The sun atom shakes; my eye electron shakes eight
minutes later, because of a direct interaction across.
Now, this has the attractive feature that
it solves both problems at once. First, I can say immediately, I
don't let the electron act on itself, I just let this act on
that, hence, no self-energy! Secondly, there is not an infinite
number of degrees of freedom in the field. There is no field at
all; or if you insist on thinking in terms of ideas like that of
a field, this field is always completely determined by the action
of the particles which produce it. You shake this particle, it
shakes that one, but if you want to think in a field way, the
field, if it's there, would be entirely determined by the matter
which generates it, and therefore, the field does not have any
independent degrees of freedom and the infinities from the
degrees of freedom would then be removed. As a matter of fact,
when we look out anywhere and see light, we can always "see" some
matter as the source of the light. We don't just see light
(except recently some radio reception has been found with no
apparent material source).
You see then that my general plan was to
first solve the classical problem, to get rid of the infinite
self-energies in the classical theory, and to hope that when I
made a quantum theory of it, everything would just be fine.
That was the beginning, and the idea seemed
so obvious to me and so elegant that I fell deeply in love with
it. And, like falling in love with a woman, it is only possible
if you do not know much about her, so you cannot see her faults.
The faults will become apparent later, but after the love is
strong enough to hold you to her. So, I was held to this theory,
in spite of all difficulties, by my youthful enthusiasm.
Then I went to graduate school and
somewhere along the line I learned what was wrong with the idea
that an electron does not act on itself. When you accelerate an
electron it radiates energy and you have to do extra work to
account for that energy. The extra force against which this work
is done is called the force of radiation resistance. The origin
of this extra force was identified in those days, following
Lorentz, as the action of the electron itself. The first term of
this action, of the electron on itself, gave a kind of inertia
(not quite relativistically satisfactory). But that inertia-like
term was infinite for a point-charge. Yet the next term in the
sequence gave an energy loss rate, which for a point-charge
agrees exactly with the rate you get by calculating how much
energy is radiated. So, the force of radiation resistance, which
is absolutely necessary for the conservation of energy would
disappear if I said that a charge could not act on itself.
So, I learned in the interim when I went to
graduate school the glaringly obvious fault of my own theory.
But, I was still in love with the original theory, and was still
thinking that with it lay the solution to the difficulties of
quantum electrodynamics. So, I continued to try on and off to
save it somehow. I must have some action develop on a given
electron when I accelerate it to account for radiation
resistance. But, if I let electrons only act on other electrons
the only possible source for this action is another electron in
the world. So, one day, when I was working for Professor Wheeler
and could no longer solve the problem that he had given me, I
thought about this again and I calculated the following. Suppose
I have two charges - I shake the first charge, which I think of
as a source and this makes the second one shake, but the second
one shaking produces an effect back on the source. And so, I
calculated how much that effect back on the first charge was,
hoping it might add up to the force of radiation resistance. It
didn't come out right, of course, but I went to Professor Wheeler
and told him my ideas. He said, - yes, but the answer you get for
the problem with the two charges that you just mentioned will,
unfortunately, depend upon the charge and the mass of the second
charge and will vary inversely as the square of the distance
R, between the charges, while the force of radiation
resistance depends on none of these things. I thought, surely, he
had computed it himself, but now having become a professor, I
know that one can be wise enough to see immediately what some
graduate student takes several weeks to develop. He also pointed
out something that also bothered me, that if we had a situation
with many charges all around the original source at roughly
uniform density and if we added the effect of all the surrounding
charges the inverse $R$ square would be compensated by the
$R^2$ in the volume element and we would get a
result proportional to the thickness of the layer, which would go
to infinity. That is, one would have an infinite total effect
back at the source. And, finally he said to me, and you forgot
something else, when you accelerate the first charge, the second
acts later, and then the reaction back here at the source would
be still later. In other words, the action occurs at the wrong
time. I suddenly realized what a stupid fellow I am, for what I
had described and calculated was just ordinary reflected light,
not radiation reaction.
But, as I was stupid, so was Professor
Wheeler that much more clever. For he then went on to give a
lecture as though he had worked this all out before and was
completely prepared, but he had not, he worked it out as he went
along. First, he said, let us suppose that the return action by
the charges in the absorber reaches the source by advanced waves
as well as by the ordinary retarded waves of reflected light; so
that the law of interaction acts backward in time, as well as
forward in time. I was enough of a physicist at that time not to
say, "Oh, no, how could that be?" For today all physicists know
from studying Einstein and Bohr, that sometimes an
idea which looks completely paradoxical at first, if analyzed to
completion in all detail and in experimental situations, may, in
fact, not be paradoxical. So, it did not bother me any more than
it bothered Professor Wheeler to use advanced waves for the back
reaction - a solution of Maxwell's equations, which previously
had not been physically used.
Professor Wheeler used advanced waves to
get the reaction back at the right time and then he suggested
this: If there were lots of electrons in the absorber, there
would be an index of refraction n, so, the retarded waves
coming from the source would have their wave lengths slightly
modified in going through the absorber. Now, if we shall assume
that the advanced waves come back from the absorber without an
index - why? I don't know, let's assume they come back without an
index - then, there will be a gradual shifting in phase between
the return and the original signal so that we would only have to
figure that the contributions act as if they come from only a
finite thickness, that of the first wave zone. (More
specifically, up to that depth where the phase in the medium is
shifted appreciably from what it would be in vacuum, a thickness
proportional to $\lambda/(n-1)$.) Now,
the less the number of electrons in here, the less each
contributes, but the thicker will be the layer that effectively
contributes because with less electrons, the index differs less
from 1. The higher the charges of these electrons, the more each
contribute, but the thinner the effective layer, because the
index would be higher. And when we estimated it, (calculated
without being careful to keep the correct numerical factor) sure
enough, it came out that the action back at the source was
completely independent of the properties of the charges that were
in the surrounding absorber. Further, it was of just the right
character to represent radiation resistance, but we were unable
to see if it was just exactly the right size. He sent me home
with orders to figure out exactly how much advanced and how much
retarded wave we need to get the thing to come out numerically
right, and after that, figure out what happens to the advanced
effects that you would expect if you put a test charge here close
to the source? For if all charges generate advanced, as well as
retarded effects, why would that test not be affected by the
advanced waves from the source?
I found that you get the right answer if
you use half-advanced and half-retarded as the field generated by
each charge. That is, one is to use the solution of Maxwell's
equation which is symmetrical in time and that the reason we got
no advanced effects at a point close to the source in spite of
the fact that the source was producing an advanced field is this.
Suppose the source is surrounded by a spherical absorbing
wall ten light seconds away, and that the test charge is one
second to the right of the source. Then the test charge is as much as
eleven seconds away from some parts of the wall and only nine
seconds away from other parts. The source acting at time
t=0 induces motions in the wall at time +10. Advanced
effects from this can act on the test charge as early as eleven
seconds earlier, or at t= -1. This is just at the time
that the direct advanced waves from the source should reach the
test charge, and it turns out the two effects are exactly equal
and opposite and cancel out! At the later time +1 effects on the
test charge from the source and from the walls are again equal,
but this time are of the same sign and add to convert the
half-retarded wave of the source to full retarded strength.
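The timing in this example can be checked with a few lines of arithmetic. The sketch below is only an illustration under stated assumptions: distances are measured in light-seconds, the wall is a sphere of radius 10 centered on the source, the test charge sits 1 light-second from the source, and the names `WALL_RADIUS`, `TEST_OFFSET`, and the helper functions are hypothetical. It checks arrival times only, not the amplitudes or signs of the effects.

```python
import math

WALL_RADIUS = 10.0   # light-seconds, wall distance from the source (from the text)
TEST_OFFSET = 1.0    # light-seconds, test charge to the right of the source


def wall_to_test_distance(theta):
    """Distance from the wall point at polar angle theta (measured from the
    source-to-test-charge axis) to the test charge, in light-seconds."""
    wx = WALL_RADIUS * math.cos(theta)
    wy = WALL_RADIUS * math.sin(theta)
    return math.hypot(wx - TEST_OFFSET, wy)


def advanced_arrival_times(samples=181):
    """Times at which advanced effects from the wall reach the test charge.

    The source acts at t = 0, so every wall point is set in motion at
    t = +WALL_RADIUS. An advanced effect from a wall point arrives at the
    test charge one travel time *earlier* than that.
    """
    times = []
    for k in range(samples):
        theta = math.pi * k / (samples - 1)
        times.append(WALL_RADIUS - wall_to_test_distance(theta))
    return times


times = advanced_arrival_times()
earliest_wall_effect = min(times)   # from the far side of the wall (11 s away)
latest_wall_effect = max(times)     # from the near side of the wall (9 s away)
direct_advanced = -TEST_OFFSET      # direct advanced wave from the source
direct_retarded = +TEST_OFFSET      # direct retarded wave from the source

print("wall effects arrive between", earliest_wall_effect, "and", latest_wall_effect)
print("direct advanced wave arrives at", direct_advanced)
print("direct retarded wave arrives at", direct_retarded)
```

The earliest wall contribution coincides in time with the direct advanced wave, and the latest with the direct retarded wave, which is the coincidence the argument relies on.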
Thus, it became clear that there was the
possibility that if we assume all actions are via
half-advanced and half-retarded solutions of Maxwell's equations
and assume that all sources are surrounded by material absorbing
all the light which is emitted, then we could account for
radiation resistance as a direct action of the charges of the
absorber acting back by advanced waves on the source.
Many months were devoted to checking all
these points. I worked to show that everything is independent of
the shape of the container, and so on, that the laws are exactly
right, and that the advanced effects really cancel in every case.
We always tried to increase the efficiency of our demonstrations,
and to see with more and more clarity why it works. I won't bore
you by going through the details of this. Because of our using
advanced waves, we also had many apparent paradoxes, which we
gradually reduced one by one, and saw that there was in fact no
logical difficulty with the theory. It was perfectly
satisfactory.
We also found that we could reformulate
this thing in another way, and that is by a principle of least
action. Since my original plan was to describe everything
directly in terms of particle motions, it was my desire to
represent this new theory without saying anything about fields.
It turned out that we found a form for an action directly
involving the motions of the charges only, which upon variation
would give the equations of motion of these charges. The
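expression for this action $A$ is

$$A = \sum_i m_i \int \left(\dot{X}^i_\mu \dot{X}^i_\mu\right)^{1/2} d\alpha_i + \frac{1}{2} \sum_{i \neq j} e_i e_j \iint \delta\!\left(I_{ij}^2\right) \dot{X}^i_\mu(\alpha_i)\, \dot{X}^j_\mu(\alpha_j)\, d\alpha_i\, d\alpha_j \qquad (1)$$

where

$$I_{ij}^2 = \left[X^i_\mu(\alpha_i) - X^j_\mu(\alpha_j)\right]\left[X^i_\mu(\alpha_i) - X^j_\mu(\alpha_j)\right]$$

where $X^i_\mu(\alpha_i)$ is the four-vector position of the $i$th particle as a
function of some parameter $\alpha_i$ (and $\dot{X}^i_\mu$ is $dX^i_\mu/d\alpha_i$). The first term is the integral of proper time, the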
ordinary action of relativistic mechanics of free particles of
mass $m_i$. (We sum in the usual way on the
repeated index $\mu$.) The second term
represents the electrical interaction of the charges. It is
summed over each pair of charges (the factor $\tfrac{1}{2}$ is to count
each pair once, the term $i=j$ is omitted to avoid
self-action). The interaction is a double integral over a delta
function of the square of the space-time interval
$I^2$ between two points on the paths. Thus,
interaction occurs only when this interval vanishes, that is,
along light cones.
The fact that the interaction is exactly
one-half advanced and half-retarded meant that we could write
such a principle of least action, whereas interaction via
retarded waves alone cannot be written in such a way.
So, all of classical electrodynamics was
contained in this very simple form. It looked good, and
therefore, it was undoubtedly true, at least to the beginner. It
automatically gave half-advanced and half-retarded effects and it
was without fields. By omitting the term in the sum when
i=j, I omit self-interaction and no longer
have any infinite self-energy. This then was the hoped-for
solution to the problem of ridding classical electrodynamics of
the infinities.
It turns out, of course, that you can
reinstate fields if you wish to, but you have to keep track of
the field produced by each particle separately. This is because
to find the right field to act on a given particle, you must
exclude the field that it creates itself. A single universal
field to which all contribute will not do. This idea had been
suggested earlier by Frenkel and so we called these Frenkel
fields. This theory which allowed only particles to act on each
other was equivalent to Frenkel's fields using half-advanced and
half-retarded solutions.
There were several suggestions for
interesting modifications of electrodynamics. We discussed lots
of them, but I shall report on only one. It was to replace this
delta function in the interaction by another function, say,
$f(I_{ij}^2)$, which is not
infinitely sharp. Instead of having the action occur only when
the interval between the two charges is exactly zero, we would
replace the delta function of $I^2$ by a narrow
peaked thing. Let's say that $f(Z)$ is large only
near $Z=0$, with width of order $a^2$. Interactions
will now occur when $T^2-R^2$ is of
order $a^2$ roughly, where $T$ is the time
difference and $R$ is the separation of the charges. This
might look like it disagrees with experience, but if $a$ is
some small distance, like $10^{-13}$ cm, it says that the
time delay $T$ in action is roughly $\sqrt{R^2 \pm a^2}$ or approximately, if $R$ is much larger than
$a$, $T = R \pm a^2/2R$. This means
that the deviation of time $T$ from the ideal theoretical
time $R$ of Maxwell gets smaller and smaller, the further
the pieces are apart. Therefore, all theories involved in
analyzing generators, motors, etc., in fact, all of the tests of
electrodynamics that were available in Maxwell's time, would be
adequately satisfied if $a$ were $10^{-13}$ cm. If $R$ is
of the order of a centimeter this deviation in $T$ is only
$10^{-26}$ parts. So, it was possible, also, to change the
theory in a simple manner and to still agree with all
observations of classical electrodynamics. You have no clue of
precisely what function to put in for f, but it was an
interesting possibility to keep in mind when developing quantum
electrodynamics.
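The size of this deviation is easy to check numerically. The sketch below is an illustration, not part of the original argument: it assumes units in which the speed of light is 1 (so $T$ and $R$ are both in centimeters), takes the delay to be $\sqrt{R^2 \pm a^2}$, and uses Python's `decimal` module because ordinary floating point cannot resolve a correction of this size. The function names are hypothetical.

```python
from decimal import Decimal, getcontext

# 60 significant digits: double precision cannot resolve 1 + 1e-26.
getcontext().prec = 60


def delay_exact(R, a, sign=+1):
    """Time delay T = sqrt(R^2 +/- a^2), in units where c = 1 (assumption)."""
    return (R * R + sign * a * a).sqrt()


def delay_approx(R, a, sign=+1):
    """Leading-order expansion T = R +/- a^2 / (2R), valid for R >> a."""
    return R + sign * a * a / (2 * R)


a = Decimal("1e-13")   # cm, the small distance quoted in the text
R = Decimal("1")       # cm, a laboratory-scale separation

exact = delay_exact(R, a)
approx = delay_approx(R, a)

fractional_deviation = (exact - R) / R       # deviation from Maxwell's T = R
approximation_error = abs(exact - approx)    # error of the expansion itself

print("fractional deviation:", fractional_deviation)
print("error of the expansion:", approximation_error)
```

For these values the fractional deviation is a few parts in $10^{27}$, the same order of magnitude as the figure quoted above, and the expansion itself is accurate far beyond that.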
It also occurred to us that if we did that
(replace $\delta$ by $f$) we could not
reinstate the term $i=j$ in the sum because this
would now represent in a relativistically invariant fashion a
finite action of a charge on itself. In fact, it was possible to
prove that if we did do such a thing, the main effect of the
self-action (for not too rapid accelerations) would be to produce
a modification of the mass. In fact, there need be no mass
$m_i$ term, all the mechanical mass could be
electromagnetic self-action. So, if you would like, we could also
have another theory with a still simpler expression for the
action $A$. In expression (1) only the second term is kept,
the sum extended over all $i$ and $j$, and some
function $f$ replaces $\delta$. Such a simple
form could represent all of classical electrodynamics, which
aside from gravitation is essentially all of classical
physics.
Although it may sound confusing, I am
describing several different alternative theories at once. The
important thing to note is that at this time we had all these in
mind as different possibilities. There were several possible
solutions of the difficulty of classical electrodynamics, any one
of which might serve as a good starting point to the solution of
the difficulties of quantum electrodynamics.
I would also like to emphasize that by this
time I was becoming used to a physical point of view different
from the more customary point of view. In the customary view,
things are discussed as a function of time in very great detail.
For example, you have the field at this moment, a differential
equation gives you the field at the next moment and so on; a
method, which I shall call the Hamilton method, the time
differential method. We have, instead (in (1) say) a thing that
describes the character of the path throughout all of space and
time. The behavior of nature is determined by saying her whole
spacetime path has a certain character. For an action like (1)
the equations obtained by variation (of $X^i_\mu(\alpha_i)$) are no longer at all easy
to get back into Hamiltonian form. If you wish to use as
variables only the coordinates of particles, then you can talk
about the property of the paths - but the path of one particle at
a given time is affected by the path of another at a different
time. If you try to describe, therefore, things differentially,
telling what the present conditions of the particles are, and how
these present conditions will affect the future you see, it is
impossible with particles alone, because something the particle
did in the past is going to affect the future.
Therefore, you need a lot of bookkeeping
variables to keep track of what the particle did in the past.
These are called field variables. You will, also, have to tell
what the field is at this present moment, if you are to be able
to see later what is going to happen. From the overall space-time
view of the least action principle, the field disappears as
nothing but bookkeeping variables insisted on by the Hamiltonian
method.
As a by-product of this same view, I
received a telephone call one day at the graduate college at
Princeton from Professor (John Archibald) Wheeler, in which he said,
"Feynman, I know why all electrons have the same charge and the same mass."
"Why?" "Because, they are all the same electron!" And, then he
explained on the telephone, "suppose that the world lines which
we were ordinarily considering before in time and space - instead
of only going up in time were a tremendous knot, and then, when
we cut through the knot, by the plane corresponding to a fixed
time, we would see many, many world lines and that would
represent many electrons, except for one thing. If in one section
this is an ordinary electron world line, in the section in which
it reversed itself and is coming back from the future we have the
wrong sign to the proper time - to the proper four velocities -
and that's equivalent to changing the sign of the charge, and,
therefore, that part of a path would act like a positron." "But,
Professor", I said, "there aren't as many positrons as
electrons." "Well, maybe they are hidden in the protons or
something", he said. I did not take the idea that all the
electrons were the same one from him as seriously as I took the
observation that positrons could simply be represented as
electrons going from the future to the past in a back section of
their world lines. That, I stole!
To summarize, when I was done with this, as
a physicist I had gained two things. One, I knew many different
ways of formulating classical electrodynamics, with many
different mathematical forms. I got to know how to express the
subject every which way. Second, I had a point of view - the
overall space-time point of view - and a disrespect for the
Hamiltonian method of describing physics.
I would like to interrupt here to make a
remark. The fact that electrodynamics can be written in so many
ways - the differential equations of Maxwell, various minimum
principles with fields, minimum principles without fields, all
different kinds of ways, was something I knew, but I have never
understood. It always seems odd to me that the fundamental laws
of physics, when discovered, can appear in so many different
forms that are not apparently identical at first, but, with a
little mathematical fiddling you can show the relationship. An
example of that is the Schrödinger
equation and the Heisenberg formulation
of quantum mechanics. I don't know why this is - it remains a
mystery, but it was something I learned from experience. There is
always another way to say the same thing that doesn't look at all
like the way you said it before. I don't know what the reason for
this is. I think it is somehow a representation of the simplicity
of nature. A thing like the inverse square law is just right to
be represented by the solution of Poisson's equation, which,
therefore, is a very different way to say the same thing that
doesn't look at all like the way you said it before. I don't know
what it means, that nature chooses these curious forms, but maybe
that is a way of defining simplicity. Perhaps a thing is simple
if you can describe it fully in several different ways without
immediately knowing that you are describing the same thing.
I was now convinced that since we had
solved the problem of classical electrodynamics (and completely
in accordance with my program from M.I.T., only direct
interaction between particles, in a way that made fields
unnecessary) that everything was definitely going to be all
right. I was convinced that all I had to do was make a quantum
theory analogous to the classical one and everything would be
solved.
So, the problem is only to make a quantum
theory, which has as its classical analog, this expression (1).
Now, there is no unique way to make a quantum theory from
classical mechanics, although all the textbooks make believe
there is. What they would tell you to do, was find the momentum
variables and replace them by $(\hbar/i)(\partial/\partial x)$, but I couldn't find a momentum variable, as there wasn't
any.
The character of quantum mechanics of the
day was to write things in the famous Hamiltonian way - in the
form of a differential equation, which described how the wave
function changes from instant to instant, and in terms of an
operator, H. If the classical physics could be reduced to
a Hamiltonian form, everything was all right. Now, least action
does not imply a Hamiltonian form if the action is a function of
anything more than positions and velocities at the same moment.
If the action is of the form of the integral of a function,
(usually called the Lagrangian) of the velocities and positions
at the same time

$$S = \int L(\dot{x}, x)\, dt \qquad (2)$$

then you can start with the Lagrangian and
then create a Hamiltonian and work out the quantum mechanics,
more or less uniquely. But this thing (1) involves the key
variables, positions, at two different times and therefore, it
was not obvious what to do to make the quantum-mechanical
analogue.
I tried - I would struggle in various ways.
One of them was this; if I had harmonic oscillators interacting
with a delay in time, I could work out what the normal modes were
and guess that the quantum theory of the normal modes was the
same as for simple oscillators and kind of work my way back in
terms of the original variables. I succeeded in doing that, but I
hoped then to generalize to other than a harmonic oscillator, but
I learned to my regret something, which many people have learned.
The harmonic oscillator is too simple; very often you can work
out what it should do in quantum theory without getting much of a
clue as to how to generalize your results to other systems.
So that didn't help me very much, but when
I was struggling with this problem, I went to a beer party in the
Nassau Tavern in Princeton. There was a gentleman, newly arrived
from Europe (Herbert Jehle) who came and sat next to me.
Europeans are much more serious than we are in America because
they think that a good place to discuss intellectual matters is a
beer party. So, he sat by me and asked, "what are you doing" and
so on, and I said, "I'm drinking beer." Then I realized that he
wanted to know what work I was doing and I told him I was
struggling with this problem, and I simply turned to him and
said, "listen, do you know any way of doing quantum mechanics,
starting with action - where the action integral comes into the
quantum mechanics?" "No", he said, "but Dirac has a paper in
which the Lagrangian, at least, comes into quantum mechanics. I
will show it to you tomorrow."
Next day we went to the Princeton Library,
they have little rooms on the side to discuss things, and he
showed me this paper. What Dirac said was the following: There is
in quantum mechanics a very important quantity which carries the
wave function from one time to another, besides the differential
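equation but equivalent to it, a kind of a kernel, which we might
call $K(x', x)$, which carries the wave
function $\varphi(x)$ known at time
$t$, to the wave function $\varphi(x')$ at time $t+\varepsilon$. Dirac points out that this function
$K$ was analogous to the quantity in classical
mechanics that you would calculate if you took the exponential of
$i\varepsilon$, multiplied by the
Lagrangian $L(\dot{x}, x)$, imagining that these two positions
$x, x'$ corresponded to $t$ and $t+\varepsilon$. In other words,

$$K(x', x) \ \text{is analogous to} \ \exp\!\left[\frac{i\varepsilon}{\hbar}\, L\!\left(\frac{x'-x}{\varepsilon},\, x\right)\right]$$
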
Professor Jehle showed me this, I read it,
he explained it to me, and I said, "what does he mean, they are
analogous; what does that mean, analogous? What is the use
of that?" He said, "you Americans! You always want to find a use
for everything!" I said, that I thought that Dirac must mean that
they were equal. "No", he explained, "he doesn't mean they are
equal." "Well", I said, "let's see what happens if we make them
equal."
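So I simply put them equal, taking the
simplest example where the Lagrangian is
$\tfrac{1}{2}M\dot{x}^2 - V(x)$, but soon found I
had to put a constant of proportionality $A$ in, suitably
adjusted. When I substituted $A\,e^{i\varepsilon L/\hbar}$ for $K$ to get

$$\varphi(x', t+\varepsilon) = \int A \exp\!\left[\frac{i\varepsilon}{\hbar}\, L\!\left(\frac{x'-x}{\varepsilon},\, x\right)\right] \varphi(x, t)\, dx \qquad (3)$$

and just calculated things out by Taylor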
series expansion, out came the Schrödinger equation. So, I
turned to Professor Jehle, not really understanding, and said,
"well, you see Professor Dirac meant that they were
proportional." Professor Jehle's eyes were bugging out - he had
taken out a little notebook and was rapidly copying it down from
the blackboard, and said, "no, no, this is an important
discovery. You Americans are always trying to find out how
something can be used. That's a good way to discover things!" So,
I thought I was finding out what Dirac meant, but, as a matter of
fact, had made the discovery that what Dirac thought was
analogous, was, in fact, equal. I had then, at least, the
connection between the Lagrangian and quantum mechanics, but
still with wave functions and infinitesimal times.
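The Taylor-series step can be sketched as follows. This is a reconstruction under stated assumptions, not a quotation from the lecture: it takes the short-time relation to be $\varphi(x', t+\varepsilon) = \int A \exp[(i\varepsilon/\hbar) L((x'-x)/\varepsilon, x)]\, \varphi(x,t)\, dx$ with $L = \tfrac{1}{2}M\dot{x}^2 - V(x)$, introduces the shift $\eta = x - x'$ (a symbol not used in the lecture), and keeps terms up to first order in $\varepsilon$, treating $\eta^2$ as of order $\varepsilon$. On the right-hand side $\varphi$ and its derivatives are evaluated at $(x', t)$.

```latex
\begin{align*}
\varphi(x', t+\varepsilon)
  &= A \int \exp\!\left[\frac{i}{\hbar}\left(\frac{M\eta^2}{2\varepsilon}
     - \varepsilon V(x'+\eta)\right)\right] \varphi(x'+\eta, t)\, d\eta,
  \qquad \eta = x - x' \\
\varphi + \varepsilon \frac{\partial \varphi}{\partial t}
  &\approx A \int e^{iM\eta^2/2\hbar\varepsilon}
     \left[1 - \frac{i\varepsilon}{\hbar} V(x')\right]
     \left[\varphi + \eta \frac{\partial \varphi}{\partial x}
       + \frac{\eta^2}{2} \frac{\partial^2 \varphi}{\partial x^2}\right] d\eta \\
\int e^{iM\eta^2/2\hbar\varepsilon}\, d\eta
  &= \left(\frac{2\pi i \hbar \varepsilon}{M}\right)^{1/2},
\qquad
\int \eta^2\, e^{iM\eta^2/2\hbar\varepsilon}\, d\eta
  = \frac{i\hbar\varepsilon}{M}
    \left(\frac{2\pi i \hbar \varepsilon}{M}\right)^{1/2} \\
\text{order } \varepsilon^0:&\quad
  A = \left(\frac{M}{2\pi i \hbar \varepsilon}\right)^{1/2} \\
\text{order } \varepsilon^1:&\quad
  i\hbar \frac{\partial \varphi}{\partial t}
  = -\frac{\hbar^2}{2M} \frac{\partial^2 \varphi}{\partial x^2}
    + V(x')\, \varphi
\end{align*}
```

The term linear in $\eta$ integrates to zero by symmetry. The zeroth-order condition is what fixes the "suitably adjusted" constant $A$, and the first-order condition is the Schrödinger equation.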
It must have been a day or so later when I
was lying in bed thinking about these things, that I imagined
what would happen if I wanted to calculate the wave function at a
finite interval later.
I would put one of these factors
eieL in here, and
that would give me the wave functions the next moment,
t+e and then I could substitute
that back into (3) to get another factor of eieL and give me the wave function the
next moment, t+2e and so on and
so on. In that way I found myself thinking of a large number of
integrals, one after the other in sequence. In the integrand was
the product of the exponentials, which, of course, was the
exponential of the sum of terms like eL. Now, L is the Lagrangian and
e is like the time interval dt,
so that if you took a sum of such terms, that's exactly like an
integral. That's like Riemann's formula for the integral
Ldt, you just take the value at each
point and add them together. We are to take the limit as
e-0, of course. Therefore, the
connection between the wave function of one instant and the wave
function of another instant a finite time later could be obtained
by an infinite number of integrals (because $\varepsilon$ goes to zero, of course) of the exponential
$e^{iS/\hbar}$, where S is the action expression (2). At
last, I had succeeded in representing quantum mechanics directly
in terms of the action S.
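Written out, the chain of substitutions looks roughly as follows. This is a sketch in notation introduced here for illustration: $x_0, \dots, x_N$ label the intermediate positions, N is the number of steps, $N\varepsilon$ is the finite interval, and $\hbar$ is written explicitly.

```latex
\begin{align*}
\psi(x_N, t+N\varepsilon)
  &= \int\!\cdots\!\int \prod_{k=0}^{N-1}
     A \exp\!\left[\frac{i\varepsilon}{\hbar}
       L\!\left(\frac{x_{k+1}-x_k}{\varepsilon}, x_k\right)\right]
     \psi(x_0, t)\, dx_0 \cdots dx_{N-1} \\
  &= \int\!\cdots\!\int A^N
     \exp\!\left[\frac{i}{\hbar}\sum_{k=0}^{N-1}
       \varepsilon\, L\!\left(\frac{x_{k+1}-x_k}{\varepsilon}, x_k\right)\right]
     \psi(x_0, t)\, dx_0 \cdots dx_{N-1}, \\
\sum_{k=0}^{N-1} \varepsilon\, L
  &\;\longrightarrow\; \int L\, dt = S
  \quad (\varepsilon \to 0,\ N\varepsilon \text{ fixed}).
\end{align*}
```

The product of exponentials becomes the exponential of a Riemann sum, and that sum is the action in the limit.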
This led later on to the idea of the
amplitude for a path; that for each possible way that the
particle can go from one point to another in space-time, there's
an amplitude. That amplitude is e to the $(i/\hbar)$ times the action for the path. Amplitudes from various
paths superpose by addition. This then is another, a third way,
of describing quantum mechanics, which looks quite different than
that of Schrödinger or Heisenberg, but which is equivalent
to them.
Now immediately after making a few checks
on this thing, what I wanted to do, of course, was to substitute
the action (1) for the other (2). The first trouble was that I
could not get the thing to work with the relativistic case of
spin one-half. However, although I could deal with the matter
only nonrelativistically, I could deal with the light or the
photon interactions perfectly well by just putting the
interaction terms of (1) into any action, replacing the mass
terms by the non-relativistic $(M\dot{x}^2/2)\,dt$.
When the action has a delay, as it now had, and involved more
than one time, I had to lose the idea of a wave function. That
is, I could no longer describe the program as: given the
amplitude for all positions at a certain time to compute the
amplitude at another time. However, that didn't cause very much
trouble. It just meant developing a new idea. Instead of wave
functions we could talk about this: that if a source of a certain
kind emits a particle, and a detector is there to receive it, we
can give the amplitude that the source will emit and the detector
receive. We do this without specifying the exact instant that the
source emits or the exact instant that any detector receives,
without trying to specify the state of anything at any particular
time in between, but by just finding the amplitude for the
complete experiment. And, then we could discuss how that
amplitude would change if you had a scattering sample in between,
as you rotated and changed angles, and so on, without really
having any wave functions.
It was also possible to discover what the
old concepts of energy and momentum would mean with this
generalized action. And, so I believed that I had a quantum
theory of classical electrodynamics - or rather of this new
classical electrodynamics described by action (1). I made a
number of checks. If I took the Frenkel field point of view,
which you remember was more differential, I could convert it
directly to quantum mechanics in a more conventional way. The
only problem was how to specify in quantum mechanics the
classical boundary conditions to use only half-advanced and
half-retarded solutions. By some ingenuity in defining what that
meant, I found that the quantum mechanics with Frenkel fields,
plus a special boundary condition, gave me back this action, (1)
in the new form of quantum mechanics with a delay. So, various
things indicated that there wasn't any doubt I had everything
straightened out.
It was also easy to guess how to modify the
electrodynamics, if anybody ever wanted to modify it. I just
changed the delta to an f, just as I would for the
classical case. So, it was very easy, a simple thing. To describe
the old retarded theory without explicit mention of fields I
would have to write probabilities, not just amplitudes. I would
have to square my amplitudes and that would involve double path
integrals in which there are two S's and so forth. Yet, as
I worked out many of these things and studied different forms and
different boundary conditions, I got a kind of funny feeling that
things weren't exactly right. I could not clearly identify the
difficulty and in one of the short periods during which I
imagined I had laid it to rest, I published a thesis and received
my Ph.D.
During the war, I didn't have time to work
on these things very extensively, but wandered about on buses and
so forth, with little pieces of paper, and struggled to work on
it and discovered indeed that there was something wrong,
something terribly wrong. I found that if one generalized the
action from the nice Lagrangian forms (2) to these forms (1)
then the quantities which I defined as energy, and so on, would
be complex. The energy values of stationary states wouldn't be
real and probabilities of events wouldn't add up to 100%. That
is, if you took the probability that this would happen and that
would happen - everything you could think of would happen, it
would not add up to one.
Another problem on which I struggled very
hard, was to represent relativistic electrons with this new
quantum mechanics. I wanted to do it in a unique and different way -
and not just by copying the operators of Dirac into some kind of
an expression and using some kind of Dirac algebra instead of
ordinary complex numbers. I was very much encouraged by the fact
that in one space dimension, I did find a way of giving an
amplitude to every path by limiting myself to paths, which only
went back and forth at the speed of light. The amplitude was
simply $(i\varepsilon)$ to a power equal to
the number of velocity reversals, where I have divided the time
into steps $\varepsilon$ and I am allowed to reverse velocity only at such a
time. This gives (as $\varepsilon$ approaches zero) Dirac's equation in two
dimensions - one dimension of space and one of time ($\hbar = M = c = 1$).
Dirac's wave function has four components
in four dimensions, but in this case, it has only two components
and this rule for the amplitude of a path automatically generates
the need for two components. Because if this is the formula for
the amplitudes of path, it will not do you any good to know the
total amplitude of all paths, which come into a given point to
find the amplitude to reach the next point. This is because for
the next time, if it came in from the right, there is no new
factor $i\varepsilon$ if it goes out to the
right, whereas, if it came in from the left there was a new
factor $i\varepsilon$. So, to continue this same
information forward to the next moment, it was not sufficient
information to know the total amplitude to arrive, but you had to
know the amplitude to arrive from the right and the amplitude to
arrive from the left, independently. If you did, however, you could
then compute both of those again independently and thus you had
to carry two amplitudes to form a differential equation (first
order in time).
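The rule for the amplitude of a path, and the reason two components are needed, can be checked numerically. The following is a minimal sketch, not the original calculation: the function names `brute_force` and `two_component` are hypothetical, the lattice uses one unit of space per time step, the units are those in which the reversal factor is simply i times the step size (`eps`), and it assumes the particle starts at the origin with no factor attached to its first step.

```python
from itertools import product


def brute_force(n_steps, eps):
    """Sum (i*eps)**reversals over all zigzag paths of n_steps unit moves.

    Returns {(final_position, last_direction): total_amplitude}.
    """
    totals = {}
    for dirs in product((+1, -1), repeat=n_steps):
        # A reversal is any step whose direction differs from the previous one.
        reversals = sum(1 for a, b in zip(dirs, dirs[1:]) if a != b)
        key = (sum(dirs), dirs[-1])
        totals[key] = totals.get(key, 0) + (1j * eps) ** reversals
    return totals


def two_component(n_steps, eps):
    """Same totals, built one time step at a time from two amplitudes per point."""
    # Assumption: the first step carries no reversal factor, so after it the
    # particle sits at +1 moving right or at -1 moving left.
    amp = {(1, +1): 1 + 0j, (-1, -1): 1 + 0j}
    for _ in range(n_steps - 1):
        new = {}
        for (x, d), a in amp.items():
            # Keep going the same way: no new factor.
            same = (x + d, d)
            new[same] = new.get(same, 0) + a
            # Reverse velocity: one extra factor of i*eps.
            flipped = (x - d, -d)
            new[flipped] = new.get(flipped, 0) + 1j * eps * a
        amp = new
    return amp
```

The step-by-step version only works because it stores the amplitude to arrive moving right and the amplitude to arrive moving left separately; the sum of the two at a point would not say which continuation earns the extra factor. For instance, `two_component(2, 0.5)` gives `0.5j` for arriving back at the origin moving right, the single two-step path with one reversal.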
And, so I dreamed that if I were clever, I
would find a formula for the amplitude of a path that was
beautiful and simple for three dimensions of space and one of
time, which would be equivalent to the Dirac equation, and for
which the four components, matrices, and all those other
mathematical funny things would come out as a simple consequence
- I have never succeeded in that either. But, I did want to
mention some of the unsuccessful things on which I spent almost
as much effort, as on the things that did work.
To summarize the situation a few years
after the war, I would say, I had much experience with quantum
electrodynamics, at least in the knowledge of many different ways
of formulating it, in terms of path integrals of actions and in
other forms. One of the important by-products, for example, of
much experience in these simple forms, was that it was easy to
see how to combine together what was in those days called the
longitudinal and transverse fields, and in general, to see
clearly the relativistic invariance of the theory. Because of the
need to do things differentially there had been, in the standard
quantum electrodynamics, a complete split of the field into two
parts, one of which is called the longitudinal part and the other
mediated by the photons, or transverse waves. The longitudinal
part was described by a Coulomb potential acting instantaneously
in the Schrödinger equation, while the transverse part had an
entirely different description in terms of quantization of the
transverse waves. This separation depended upon the relativistic
tilt of your axes in spacetime. People moving at different
velocities would separate the same field into longitudinal and
transverse fields in a different way. Furthermore, the entire
formulation of quantum mechanics insisting, as it did, on the
wave function at a given time, was hard to analyze
relativistically. Somebody else in a different coordinate system
would calculate the succession of events in terms of wave
functions on differently cut slices of space-time, and with a
different separation of longitudinal and transverse parts. The
Hamiltonian theory did not look relativistically invariant,
although, of course, it was. One of the great advantages of the
overall point of view, was that you could see the relativistic
invariance right away - or as Schwinger would say -
the covariance was manifest. I had the advantage, therefore, of
having a manifestly covariant form for quantum electrodynamics
with suggestions for modifications and so on. I had the
disadvantage that if I took it too seriously - I mean, if I took
it seriously at all in this form, - I got into trouble with these
complex energies and the failure of probabilities to add up to one
and so on. I was unsuccessfully struggling with that.
Then Lamb did his experiment,
measuring the separation of the $2S_{1/2}$ and
$2P_{1/2}$ levels of hydrogen, finding it to be about 1000
megacycles of frequency difference. Professor Bethe, with whom I was
then associated at Cornell, is a man who has this characteristic:
If there's a good experimental number you've got to figure it out
from theory. So, he forced the quantum electrodynamics of the day
to give him an answer to the separation of these two levels. He
pointed out that the self-energy of an electron itself is
infinite, so that the calculated energy of a bound electron
should also come out infinite. But, when you calculated the
separation of the two energy levels in terms of the corrected
mass instead of the old mass, it would turn out, he thought, that
the theory would give convergent finite answers. He made an
estimate of the splitting that way and found out that it was
still divergent, but he guessed that was probably due to the fact
that he used an unrelativistic theory of the matter. Assuming it
would be convergent if relativistically treated, he estimated he
would get about a thousand megacycles for the Lamb-shift, and
thus, made the most important discovery in the history of the
theory of quantum electrodynamics. He worked this out on the
train from Ithaca, New York to Schenectady and telephoned me
excitedly from Schenectady to tell me the result, which I don't
remember fully appreciating at the time.
Returning to Cornell, he gave a lecture on
the subject, which I attended. He explained that it gets very
confusing to figure out exactly which infinite term corresponds
to what in trying to make the correction for the infinite change
in mass. If there were any modifications whatever, he said, even
though not physically correct, (that is not necessarily the way
nature actually works) but any modification whatever at high
frequencies, which would make this correction finite, then there
would be no problem at all to figuring out how to keep track of
everything. You just calculate the finite mass correction
$\Delta m$ to the electron mass
$m_0$, substitute the numerical values of
$m_0 + \Delta m$
for m in the results for any other problem and all these
ambiguities would be resolved. If, in addition, this method were
relativistically invariant, then we would be absolutely sure how
to do it without destroying relativistic invariance.
After the lecture, I went up to him and
told him, "I can do that for you, I'll bring it in for you
tomorrow." I guess I knew every way to modify quantum
electrodynamics known to man, at the time. So, I went in next
day, and explained what would correspond to the modification of
the delta-function to f and asked him to explain to me how
you calculate the self-energy of an electron, for instance, so we
can figure out if it's finite.
I want you to see an interesting point. I
did not take the advice of Professor Jehle to find out how it was
useful. I never used all that machinery which I had cooked up to
solve a single relativistic problem. I hadn't even calculated the
self-energy of an electron up to that moment, and was studying
the difficulties with the conservation of probability, and so on,
without actually doing anything, except discussing the general
properties of the theory.
But now I went to Professor Bethe, who
explained to me on the blackboard, as we worked together, how to
calculate the self-energy of an electron. Up to that time when
you did the integrals they had been logarithmically divergent. I
told him how to make the relativistically invariant modifications
that I thought would make everything all right. We set up the
integral which then diverged at the sixth power of the frequency
instead of logarithmically!
So, I went back to my room and worried
about this thing and went around in circles trying to figure out
what was wrong because I was sure physically everything had to
come out finite, I couldn't understand how it came out infinite.
I became more and more interested and finally realized I had to
learn how to make a calculation. So, ultimately, I taught myself
how to calculate the self-energy of an electron working my
patient way through the terrible confusion of those days of
negative energy states and holes and longitudinal contributions
and so on. When I finally found out how to do it and did it with
the modifications I wanted to suggest, it turned out that it was
nicely convergent and finite, just as I had expected. Professor
Bethe and I have never been able to discover what we did wrong on
that blackboard two months before, but apparently we just went
off somewhere and we have never been able to figure out where. It
turned out, that what I had proposed, if we had carried it out
without making a mistake would have been all right and would have
given a finite correction. Anyway, it forced me to go back over
all this and to convince myself physically that nothing can go
wrong. At any rate, the correction to mass was now finite,
proportional to $\ln(mca/\hbar)$, where a is the width of
that function f which was substituted for $\delta$. If you wanted an unmodified electrodynamics,
you would have to take a equal to zero, getting an
infinite mass correction. But, that wasn't the point. Keeping a
finite, I simply followed the program outlined by Professor Bethe
and showed how to calculate all the various things, the
scatterings of electrons from atoms without radiation, the shifts
of levels and so forth, calculating everything in terms of the
experimental mass, and noting that the results as Bethe
suggested, were not sensitive to a in this form and even
had a definite limit as $a \to 0$.
The rest of my work was simply to improve
the techniques then available for calculations, making diagrams
to help analyze perturbation theory quicker. Most of this was
first worked out by guessing - you see, I didn't have the
relativistic theory of matter. For example, it seemed to me
obvious that the velocities in non-relativistic formulas have to
be replaced by Dirac's matrix $\alpha$ or in
the more relativistic forms by the operators $\gamma_\mu$. I just took my guesses from the forms that I had worked
out using path integrals for nonrelativistic matter, but
relativistic light. It was easy to develop rules of what to
substitute to get the relativistic case. I was very surprised to
discover that it was not known at that time, that every one of
the formulas that had been worked out so patiently by separating
longitudinal and transverse waves could be obtained from the
formula for the transverse waves alone, if instead of summing
over only the two perpendicular polarization directions you would
sum over all four possible directions of polarization. It was so
obvious from the action (1) that I thought it was general
knowledge and would do it all the time. I would get into
arguments with people, because I didn't realize they didn't know
that; but, it turned out that all their patient work with the
longitudinal waves was always equivalent to just extending the
sum on the two transverse directions of polarization over all
four directions. This was one of the amusing advantages of the
method. In addition, I included diagrams for the various terms of
the perturbation series, improved notations to be used, worked
out easy ways to evaluate integrals, which occurred in these
problems, and so on, and made a kind of handbook on how to do
quantum electrodynamics.
But one step of importance that was
physically new was involved with the negative energy sea of
Dirac, which caused me so much logical difficulty. I got so
confused that I remembered Wheeler's old idea about the positron
being, maybe, the electron going backward in time. Therefore, in
the time dependent perturbation theory that was usual for getting
self-energy, I simply supposed that for a while we could go
backward in the time, and looked at what terms I got by running
the time variables backward. They were the same as the terms that
other people got when they did the problem a more complicated
way, using holes in the sea, except, possibly, for some signs.
These, I, at first, determined empirically by inventing and
trying some rules.
I have tried to explain that all the
improvements of relativistic theory were at first more or less
straightforward, semi-empirical shenanigans. Each time I would
discover something, however, I would go back and I would check it
so many ways, compare it to every problem that had been done
previously in electrodynamics (and later, in weak coupling meson
theory) to see if it would always agree, and so on, until I was
absolutely convinced of the truth of the various rules and
regulations which I concocted to simplify all the work.
During this time, people had been
developing meson theory, a subject I had not studied in any
detail. I became interested in the possible application of my
methods to perturbation calculations in meson theory. But, what
was meson theory? All I knew was that meson theory was something
analogous to electrodynamics, except that particles corresponding
to the photon had a mass. It was easy to guess that the $\delta$-function in (1), which was a solution of
d'Alembertian equals zero, was to be changed to the corresponding
solution of d'Alembertian equals $m^2$. Next,
there were different kinds of mesons - the ones in closest analogy
to photons, coupled via $\gamma_\mu$, are called vector mesons - there were also scalar mesons.
Well, maybe that corresponds to putting unity in place of the
$\gamma_\mu$. I would here then speak of "pseudo vector
coupling" and I would guess what that probably was. I didn't have
the knowledge to understand the way these were defined in the
conventional papers because they were expressed at that time in
terms of creation and annihilation operators, and so on, which, I
had not successfully learned. I remember that when someone had
started to teach me about creation and annihilation operators,
that this operator creates an electron, I said, "how do you
create an electron? It disagrees with the conservation of
charge", and in that way, I blocked my mind from learning a very
practical scheme of calculation. Therefore, I had to find as many
opportunities as possible to test whether I guessed right as to
what the various theories were.
One day a dispute arose at a Physical
Society meeting as to the correctness of a calculation by
Slotnick of the interaction of an electron with a neutron using
pseudo scalar theory with pseudo vector coupling and also, pseudo
scalar theory with pseudo scalar coupling. He had found that the
answers were not the same, in fact, by one theory, the result was
divergent, although convergent with the other. Some people
believed that the two theories must give the same answer for the
problem. This was a welcome opportunity to test my guesses as to
whether I really did understand what these two couplings were.
So, I went home, and during the evening I worked out the electron
neutron scattering for the pseudo scalar and pseudo vector
coupling, saw they were not equal and subtracted them, and worked
out the difference in detail. The next day at the meeting, I saw
Slotnick and said, "Slotnick, I worked it out last night, I
wanted to see if I got the same answers you do. I got a different
answer for each coupling - but, I would like to check in detail
with you because I want to make sure of my methods." And, he
said, "what do you mean you worked it out last night, it took me
six months!" And, when we compared the answers he looked at mine
and he asked, "what is that Q in there, that variable
Q?" (I had expressions like $(\tan^{-1} Q)/Q$
etc.). I said, "that's the momentum transferred by the
electron, the electron deflected by different angles." "Oh", he
said, "no, I only have the limiting value as Q approaches
zero; the forward scattering." Well, it was easy enough to just
substitute Q equals zero in my form and I then got the
same answers as he did. But, it took him six months to do the
case of zero momentum transfer, whereas, during one evening I had
done the finite and arbitrary momentum transfer. That was a
thrilling moment for me, like receiving the Nobel Prize, because
that convinced me, at last, I did have some kind of method and
technique and understood how to do something that other people
did not know how to do. That was my moment of triumph in which I
realized I really had succeeded in working out something
worthwhile.
At this stage, I was urged to publish this
because everybody said it looks like an easy way to make
calculations, and wanted to know how to do it. I had to publish
it, missing two things; one was proof of every statement in a
mathematically conventional sense. Often, even in a physicist's
sense, I did not have a demonstration of how to get all of these
rules and equations from conventional electrodynamics. But, I did
know from experience, from fooling around, that everything was,
in fact, equivalent to the regular electrodynamics and had
partial proofs of many pieces, although, I never really sat down,
like Euclid did for the geometers of Greece, and made sure that
you could get it all from a single simple set of axioms. As a
result, the work was criticized, I don't know whether favorably
or unfavorably, and the "method" was called the "intuitive
method". For those who do not realize it, however, I should like
to emphasize that there is a lot of work involved in using this
"intuitive method" successfully. Because no simple clear proof of
the formula or idea presents itself, it is necessary to do an
unusually great amount of checking and rechecking for consistency
and correctness in terms of what is known, by comparing to other
analogous examples, limiting cases, etc. In the face of the lack
of direct mathematical demonstration, one must be careful and
thorough to make sure of the point, and one should make a
perpetual attempt to demonstrate as much of the formula as
possible. Nevertheless, a very great deal more truth can become
known than can be proven.
It must be clearly understood that in all
this work, I was representing the conventional electrodynamics
with retarded interaction, and not my half-advanced and
half-retarded theory corresponding to (1). I merely use (1) to
guess at forms. And, one of the forms I guessed at corresponded
to changing $\delta$ to a function f
of width $a^2$, so that I could calculate finite
results for all of the problems. This brings me to the second
thing that was missing when I published the paper, an unresolved
difficulty. With $\delta$ replaced by
f the calculations would give results which were not
"unitary", that is, for which the sum of the probabilities of all
alternatives was not unity. The deviation from unity was very
small, in practice, if a was very small. In the limit that
I took a very tiny, it might not make any difference. And,
so the process of the renormalization could be made, you could
calculate everything in terms of the experimental mass and then
take the limit and the apparent difficulty that the unitarity is
violated temporarily seems to disappear. I was unable to
demonstrate that, as a matter of fact, it does.
It is lucky that I did not wait to
straighten out that point, for as far as I know, nobody has yet
been able to resolve this question. Experience with meson
theories with stronger couplings and with strongly coupled vector
photons, although not proving anything, convinces me that if the
coupling were stronger, or if you went to a higher order (137th
order of perturbation theory for electrodynamics), this
difficulty would remain in the limit and there would be real
trouble. That is, I believe there is really no satisfactory
quantum electrodynamics, but I'm not sure. And, I believe, that
one of the reasons for the slowness of present-day progress in
understanding the strong interactions is that there isn't
any relativistic theoretical model, from which you can really
calculate everything. Although, it is usually said, that the
difficulty lies in the fact that strong interactions are too hard
to calculate, I believe, it is really because strong interactions
in field theory have no solution, have no sense - they're either
infinite, or, if you try to modify them, the modification
destroys the unitarity. I don't think we have a completely
satisfactory relativistic quantum-mechanical model, even one that
doesn't agree with nature, but, at least, agrees with the logic
that the sum of probability of all alternatives has to be 100%.
Therefore, I think that the renormalization theory is simply a
way to sweep the difficulties of the divergences of
electrodynamics under the rug. I am, of course, not sure of
that.
This completes the story of the development
of the space-time view of quantum electrodynamics. I wonder if
anything can be learned from it. I doubt it. It is most striking
that most of the ideas developed in the course of this research
were not ultimately used in the final result. For example, the
half-advanced and half-retarded potential was not finally used,
the action expression (1) was not used, the idea that charges do
not act on themselves was abandoned. The path-integral
formulation of quantum mechanics was useful for guessing at final
expressions and at formulating the general theory of
electrodynamics in new ways - although, strictly it was not
absolutely necessary. The same goes for the idea of the positron
being a backward moving electron, it was very convenient, but not
strictly necessary for the theory because it is exactly
equivalent to the negative energy sea point of view.
We are struck by the very large number of
different physical viewpoints and widely different mathematical
formulations that are all equivalent to one another. The method
used here, of reasoning in physical terms, therefore, appears to
be extremely inefficient. On looking back over the work, I can
only feel a kind of regret for the enormous amount of physical
reasoning and mathematical re-expression which ends by merely
re-expressing what was previously known, although in a form which
is much more efficient for the calculation of specific problems.
Would it not have been much easier to simply work entirely in the
mathematical framework to elaborate a more efficient expression?
This would certainly seem to be the case, but it must be remarked
that although the problem actually solved was only such a
reformulation, the problem originally tackled was the (possibly
still unsolved) problem of avoidance of the infinities of the
usual theory. Therefore, a new theory was sought, not just a
modification of the old. Although the quest was unsuccessful, we
should look at the question of the value of physical ideas in
developing a new theory.
Many different physical ideas can describe
the same physical reality. Thus, classical electrodynamics can be
described by a field view, or an action at a distance view, etc.
Originally, Maxwell filled space with idler wheels, and Faraday
with field lines, but somehow the Maxwell equations themselves
are pristine and independent of the elaboration of words
attempting a physical description. The only true physical
description is that describing the experimental meaning of the
quantities in the equation - or better, the way the equations are
to be used in describing experimental observations. This being
the case perhaps the best way to proceed is to try to guess
equations, and disregard physical models or descriptions. For
example, MacCullagh guessed the correct equations for light
propagation in a crystal long before his colleagues using elastic
models could make head or tail of the phenomena, or again, Dirac
obtained his equation for the description of the electron by an
almost purely mathematical proposition. A simple physical view by
which all the contents of this equation can be seen is still
lacking.
Therefore, I think equation guessing might
be the best method to proceed to obtain the laws for the part of
physics which is presently unknown. Yet, when I was much younger,
I tried this equation guessing and I have seen many students try
this, but it is very easy to go off in wildly incorrect and
impossible directions. I think the problem is not to find the
best or most efficient method to proceed to a discovery,
but to find any method at all. Physical reasoning does help some
people to generate suggestions as to how the unknown may be
related to the known. Theories of the known, which are described
by different physical ideas may be equivalent in all their
predictions and are hence scientifically indistinguishable.
However, they are not psychologically identical when trying to
move from that base into the unknown. For different views suggest
different kinds of modifications which might be made and hence
are not equivalent in the hypotheses one generates from them in
one's attempt to understand what is not yet understood. I,
therefore, think that a good theoretical physicist today might
find it useful to have a wide range of physical viewpoints and
mathematical expressions of the same theory (for example, of
quantum electrodynamics) available to him. This may be asking too
much of one man. Then new students should as a class have this.
If every individual student follows the same current fashion in
expressing and thinking about electrodynamics or field theory,
then the variety of hypotheses being generated to understand
strong interactions, say, is limited. Perhaps rightly so, for
possibly the chance is high that the truth lies in the
fashionable direction. But, on the off-chance that it is in
another direction - a direction obvious from an unfashionable
view of field theory - who will find it? Only someone who has
sacrificed himself by teaching himself quantum electrodynamics
from a peculiar and unusual point of view; one that he may have
to invent for himself. I say sacrificed himself because he most
likely will get nothing from it, because the truth may lie in
another direction, perhaps even the fashionable one.
But, if my own experience is any guide, the
sacrifice is really not great because if the peculiar viewpoint
taken is truly experimentally equivalent to the usual in the
realm of the known there is always a range of applications and
problems in this realm for which the special viewpoint gives one
a special power and clarity of thought, which is valuable in
itself. Furthermore, in the search for new laws, you always have
the psychological excitement of feeling that possibly nobody has
yet thought of the crazy possibility you are looking at right
now.
So what happened to the old theory that I
fell in love with as a youth? Well, I would say it's become an
old lady, that has very little attractive left in her and the
young today will not have their hearts pound anymore when they look at
her. But, we can say the best we can for any old woman,
that she has been a very good mother and she has given birth to
some very good children. And, I thank the Swedish Academy of
Sciences for complimenting one of them. Thank you.
From Nobel Lectures, Physics 1963-1970, Elsevier Publishing Company, Amsterdam, 1972