Calculus

Lesson 65

Calculus of Variations

Back to Dr. Nandor's Calculus Notes Page

 Back to Dr. Nandor's Calculus Page

 

 

Thanks to Dr. Helliwell of Harvey Mudd College - this lesson essentially comes from the class notes from his Theoretical Mechanics class at Mudd in 1992-1993.

 

Let's say that we know that there is some function that depends on the path we take. It doesn't matter what the function is: it could be friction, or work, or earned wages, whatever. We write it as $f(y, y', x)$, where $y(x)$ is the path taken. So if the function were wages, perhaps the wages depend on which path is taken, and how fast each section of the path is trod.

If we wanted to add up all of the wages, we would have to do an integral:

$$I = \int_{x_1}^{x_2} f(y, y', x)\,dx$$

$I$ is called a "functional": its input is an entire path $y(x)$ and its output is a single number. It is NOT necessarily a function!

 

 

Now, let us suppose that we want to evaluate $I$. In theory, we can, at the very least, numerically integrate it if we know what $y(x)$ is.
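As a rough illustration of "numerically integrate it," here is a minimal sketch that evaluates a functional for a given path with the midpoint rule. The integrand (arc length, $f = \sqrt{1 + y'^2}$), the two sample paths, and the helper name `evaluate_functional` are all assumptions chosen for illustration; they are not part of the original notes.

```python
import math

def evaluate_functional(f, y, dy, x1, x2, n=10_000):
    """Approximate I = integral of f(y, y', x) dx from x1 to x2 (midpoint rule)."""
    h = (x2 - x1) / n
    total = 0.0
    for i in range(n):
        x = x1 + (i + 0.5) * h          # midpoint of the i-th slice
        total += f(y(x), dy(x), x) * h
    return total

# Sample integrand (an assumption for illustration): arc length.
def arc_length(y, yp, x):
    return math.sqrt(1.0 + yp * yp)

# Two paths joining (0, 0) and (1, 1): a straight line and a parabola.
line_length = evaluate_functional(arc_length, lambda x: x, lambda x: 1.0, 0.0, 1.0)
parabola_length = evaluate_functional(arc_length, lambda x: x * x, lambda x: 2.0 * x, 0.0, 1.0)
print(line_length, parabola_length)
```

The same path endpoints give different values of the functional, which is the whole point: the number depends on the path taken.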

 

 

The question is: if we know what outcome we want, can we find a $y(x)$ that produces it? More specifically, what if we want to minimize or maximize $I$? Can we find what $y(x)$ must be? Does such a function even exist?

 

                       

 

 
 

                                                

 

 

 

 

 

Well, let us, for the moment, assume that it does exist, and we'll call it $y(x)$. Let's also say that any other function is $Y(x) = y(x) + \delta(x)$, where $\delta(x)$ is the difference between $Y$ and $y$ at any given point. It will be easier, for our purposes, to actually substitute $Y$ into $I$ instead of $y$, since we have a way to vary $Y$, as we shall see. Note that if $\delta \to 0$, then $Y \to y$, so we should be able to make the substitution without any ill effects, as long as $\delta \to 0$ at the end.

 

 

 

                       

 

 
   

                                                       

        

 

 

 

We want some way to plug $Y$ into our functional and to manipulate it to reduce it to $y$ when we are minimizing the functional. Thus we will want to see how $\delta \to 0$. So, let's say

$$\delta(x) = \epsilon\,\eta(x) + \epsilon^2\,\eta_2(x) + \epsilon^3\,\eta_3(x) + \cdots$$

which is a power series. $\epsilon$ is called the "smallness parameter." As we make $\epsilon \to 0$, we certainly make $\delta \to 0$; however, note that we also make the higher order functions unimportant, since they are multiplied by higher powers of $\epsilon$. Doing the above is similar to the statement we always make about a function looking like a line if we zoom in really close to it. In the same way, as we zoom in on $\delta$ as $\epsilon \to 0$, it looks like a multiple of the same function $\eta(x)$. Therefore, it doesn't matter which $\delta$ we use; it is just some multiple of the function $\eta(x)$ as long as we zoom in close enough. So we have reduced the problem from one with an infinite number of $\delta$s to one where we always have the same small difference function, $\eta(x)$, and we only vary the size of a constant, $\epsilon$.

 

 

 

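Now, let's concentrate on $\eta(x)$. We know some things about it: $\eta(x_1) = \eta(x_2) = 0$, since $Y = y$ at the endpoints. Plugging everything into our functional, then,

$$I(\epsilon) = \int_{x_1}^{x_2} f(Y, Y', x)\,dx$$

where we know

$$Y = y + \epsilon\,\eta, \qquad Y' = y' + \epsilon\,\eta'$$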

So. What is the method we will use to find the extremum of our functional?

The same thing we always do! Take the derivative with respect to $\epsilon$, the quantity that must go to zero, and set it equal to zero:

$$\frac{dI}{d\epsilon} = 0$$

 

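This is especially convenient since $\epsilon$ is not in either integration limit, so we can bring the derivative inside the integral.

$$\frac{dI}{d\epsilon} = \int_{x_1}^{x_2} \left(\frac{\partial f}{\partial Y}\frac{\partial Y}{\partial \epsilon} + \frac{\partial f}{\partial Y'}\frac{\partial Y'}{\partial \epsilon}\right)dx$$

So what we have so far is:

$$\frac{dI}{d\epsilon} = \int_{x_1}^{x_2} \frac{\partial f}{\partial Y}\,\eta\,dx + \int_{x_1}^{x_2} \frac{\partial f}{\partial Y'}\,\eta'\,dx$$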

 

 

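We can integrate the second integral by parts.

$$\int_{x_1}^{x_2} \frac{\partial f}{\partial Y'}\,\eta'\,dx = \left[\frac{\partial f}{\partial Y'}\,\eta\right]_{x_1}^{x_2} - \int_{x_1}^{x_2} \frac{d}{dx}\left(\frac{\partial f}{\partial Y'}\right)\eta\,dx$$

The bracketed term = 0 since $\eta = 0$ at the endpoints.

So, the derivative of our functional is:

$$\frac{dI}{d\epsilon} = \int_{x_1}^{x_2} \frac{\partial f}{\partial Y}\,\eta\,dx - \int_{x_1}^{x_2} \frac{d}{dx}\left(\frac{\partial f}{\partial Y'}\right)\eta\,dx$$

We are looking for an extremum, so combining the integrals and setting the integral = 0, we get:

$$\int_{x_1}^{x_2} \left(\frac{\partial f}{\partial Y} - \frac{d}{dx}\frac{\partial f}{\partial Y'}\right)\eta\,dx = 0$$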

 

The above equation must be true for ANY arbitrary $\eta(x)$ (provided that it goes to zero at the endpoints), so the only way that the integral will ALWAYS be zero is if what is in the parentheses is always zero. Note that if we take the limit now as $\epsilon \to 0$, we arrive at our extremum path instead of one that is merely close to our extremum path.

$$\frac{\partial f}{\partial y} - \frac{d}{dx}\left(\frac{\partial f}{\partial y'}\right) = 0$$

This is called Euler's Equation (EI). For the integral to be an extremum, the function inside the integral MUST obey this equation.
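The statement $dI/d\epsilon = 0$ can be checked numerically. The sketch below assumes the arc-length integrand $f = \sqrt{1 + Y'^2}$, the trial path $y(x) = x$ on $[0, 1]$, and the variation $\eta(x) = \sin(\pi x)$, which vanishes at both endpoints. These choices and the function names are illustrative assumptions, not taken from the original notes.

```python
import math

def path_length(eps, n=2000):
    """I(eps) for f = sqrt(1 + Y'^2) with Y = y + eps*eta on [0, 1].

    y(x) = x is the trial extremal; eta(x) = sin(pi x) vanishes at both ends.
    """
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        slope = 1.0 + eps * math.pi * math.cos(math.pi * x)  # Y' = y' + eps*eta'
        total += math.sqrt(1.0 + slope * slope) * h
    return total

def dI_deps(eps, step=1e-4):
    """Central-difference estimate of dI/d(eps)."""
    return (path_length(eps + step) - path_length(eps - step)) / (2.0 * step)

at_extremal = dI_deps(0.0)    # close to zero: the line is stationary
away = dI_deps(0.5)           # clearly nonzero away from the extremal
print(at_extremal, away)
```

The derivative vanishes only at $\epsilon = 0$, where the varied path coincides with the extremal path.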

 

                         Think back to what we were doing to recap:  We are trying to find

                                    the function that extremizes the integral.  By saying that we were

                                    using a function close to, but not quite, the minimizing function,

                                    we were able to derive an equation that the minimizing function

                                    must obey.  We haven't done any examples yet, but at least we know

                                    now what one of the conditions placed on the function must be.                   

                        Also remember that all we've done is find a condition for the integral to be an extremum; we haven't specified whether that extremum is a min, max, or stationary. However, usually we can use common sense.

 

                        Example:  if , write out Euler's Equation.

 

                                   

 

                        Of course we don't know how to solve this equation, but it's a start.

 

 

 

 

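Let's do one we can solve: Find the shortest distance between two points.

$$I = \int ds = \int_{x_1}^{x_2} \sqrt{1 + y'^2}\,dx$$

Using EI:

$$\frac{\partial f}{\partial y} - \frac{d}{dx}\left(\frac{\partial f}{\partial y'}\right) = 0 \quad \text{where} \quad f = \sqrt{1 + y'^2}$$

Since $f$ has no explicit $y$-dependence, $\partial f/\partial y = 0$. So doing a bit of algebra:

$$\frac{d}{dx}\left(\frac{y'}{\sqrt{1 + y'^2}}\right) = 0 \;\Rightarrow\; \frac{y'}{\sqrt{1 + y'^2}} = \text{constant} \;\Rightarrow\; y' = m \;\Rightarrow\; y = mx + b$$

The shortest distance between two points is a line!!!!!!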
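For $f = \sqrt{1 + y'^2}$, Euler's Equation reduces to $y'' = 0$. As a minimal sketch of solving that condition numerically instead of by hand, the code below starts from a curved path on a uniform grid and repeatedly replaces each interior point by the average of its neighbours (a discrete form of $y'' = 0$). The grid size, the starting curve, and the helper name `relax_to_extremal` are assumptions for illustration.

```python
def relax_to_extremal(y_start, sweeps=5000):
    """Repeatedly replace each interior point by the mean of its neighbours.

    On a uniform x-grid this solves the discrete form of y'' = 0, which is what
    Euler's Equation reduces to for f = sqrt(1 + y'^2). Endpoints stay fixed.
    """
    y = list(y_start)
    for _ in range(sweeps):
        for i in range(1, len(y) - 1):
            y[i] = 0.5 * (y[i - 1] + y[i + 1])   # in-place (Gauss-Seidel) update
    return y

n = 20
xs = [i / n for i in range(n + 1)]
wiggly = [x + 0.3 * x * (1.0 - x) for x in xs]   # curved path from (0,0) to (1,1)
straight = relax_to_extremal(wiggly)
max_dev = max(abs(yi - xi) for xi, yi in zip(xs, straight))
print(max_dev)
```

The relaxed path settles onto the straight line $y = x$ joining the two fixed endpoints.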

 

 

 

                        But we won't always be able to solve EI!

 

                        Let's find another formulation of Euler's Equation that might make it

                        easier for us to solve.

 

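Let's consider $\frac{df}{dx}$ and $\frac{d}{dx}\left(y'\frac{\partial f}{\partial y'}\right)$. Why? Because I'm telling you that we can get an easier form of Euler's Equation if we do! So there!

$$\frac{df}{dx} = \frac{\partial f}{\partial x} + \frac{\partial f}{\partial y}\,y' + \frac{\partial f}{\partial y'}\,y''$$

But we know from EI that $\frac{\partial f}{\partial y} = \frac{d}{dx}\left(\frac{\partial f}{\partial y'}\right)$, so

$$\frac{df}{dx} = \frac{\partial f}{\partial x} + y'\,\frac{d}{dx}\left(\frac{\partial f}{\partial y'}\right) + y''\,\frac{\partial f}{\partial y'} \qquad (*)$$

The other part of what we're looking at:

$$\frac{d}{dx}\left(y'\frac{\partial f}{\partial y'}\right) = y''\,\frac{\partial f}{\partial y'} + y'\,\frac{d}{dx}\left(\frac{\partial f}{\partial y'}\right) \qquad (**)$$

Subtracting the two results (** - *), we get

$$\frac{d}{dx}\left(y'\frac{\partial f}{\partial y'} - f\right) = -\frac{\partial f}{\partial x}$$

This is (EII).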

 

 

 

The reason EII is useful is that often we will have a function that is not explicitly a function of $x$, which means $\partial f/\partial x = 0$. When this is the case,

$$y'\frac{\partial f}{\partial y'} - f = \text{constant}$$

 

 

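So, recap.

Use EII when $f$ is not explicitly a function of $x$. Then we can move straight to

$$y'\frac{\partial f}{\partial y'} - f = \text{constant}$$

Use EI when $f$ is not explicitly a function of $y$. Then we can move straight to

$$\frac{\partial f}{\partial y'} = \text{constant}$$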

 

 

 

 

The reason this is called variational calculus is that, near the extremum path, small variations of the path do not make a large change in our functional. Think of it like being at the vertex of a parabola: small variations in x barely change the value of the function, because the slope there is zero. However, when we are NOT near the vertex, a small change in x has a much larger effect on the value.

 

 

                        Note that when we look at our previous line example, $f = \sqrt{1 + y'^2}$ is an explicit function of neither $x$ nor $y$, so we could use either

                                    EI or EII (although we would still use EI since although we know

                                    , we don't know  so we wouldn't be able

                                    to solve EII).

 

 

 

 

Now, let's see the problem that essentially launched the calculus of variations: the brachistochrone. Johann Bernoulli posed it as a challenge in 1696, and Newton was among those who solved it.

 

 

When only gravity is considered, what is the fastest path between two points? It's not a line. Note that the sooner we speed up, the faster we'll be. The answer isn't straight down first, since then we waste time by taking a path that's too long. So. We know what it should look like, at least approximately:

[Figure: sketch of a curved path from A down to B]

 

 

 

 

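Let's minimize time:

$$t = \int dt = \int \frac{ds}{v}$$

We know $ds = \sqrt{1 + y'^2}\,dx$ is the differential path length. We can use the Work-Energy Equation (transferring potential to kinetic energy, which we have done innumerable times) to find that $v = \sqrt{2gy}$ at any given point, taking $y$ as positive downward and starting from rest at $y = 0$.

$$t = \int \frac{\sqrt{1 + y'^2}}{\sqrt{2gy}}\,dx = \frac{1}{\sqrt{2g}} \int \sqrt{\frac{1 + y'^2}{y}}\,dx$$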

 

 

 

 

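Plugging this into EII (since $f = \sqrt{(1 + y'^2)/y}$ has no explicit $x$-dependence),

$$y'\frac{\partial f}{\partial y'} - f = \frac{y'^2}{\sqrt{y\,(1 + y'^2)}} - \sqrt{\frac{1 + y'^2}{y}} = -\frac{1}{\sqrt{y\,(1 + y'^2)}} = C$$

We will call $1/C^2 = 2a$ to simplify things, so that $y\,(1 + y'^2) = 2a$. Also, we only choose the +ve answer for $y'$ since $y$ is +ve down!

$$y' = \sqrt{\frac{2a - y}{y}} \quad\Rightarrow\quad x = \int \sqrt{\frac{y}{2a - y}}\,dy$$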

 

 

 

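I'll tell you the substitution to make: $y = a(1 - \cos\theta)$

We can at least see why this might work!

$$x = \int \sqrt{\frac{1 - \cos\theta}{1 + \cos\theta}}\;a\sin\theta\,d\theta = a\int (1 - \cos\theta)\,d\theta = a(\theta - \sin\theta) + x_0$$

We can solve for $x_0$ by saying that our starting point is $(0, 0)$. This gives us $x_0 = 0$.

Since we have already said what $y$ is, we now have a parametric solution to the problem:

$$x = a(\theta - \sin\theta), \qquad y = a(1 - \cos\theta)$$

These equations describe what is called a "cycloid." It is the shape that is traced by a point on the rim of a circle as it rolls without slipping on a flat surface. If we are given a final $(x, y)$, we can solve for the constant, $a$.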
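Solving for the constant has no closed form, but it is easy to do numerically. The sketch below assumes the parametric form $x = a(\theta - \sin\theta)$, $y = a(1 - \cos\theta)$ with the start at the origin and $y$ measured downward. It finds the endpoint parameter by bisection on the ratio $x/y$ (which does not depend on $a$) and then reads off $a$. The function name `fit_cycloid` and the sample endpoint are hypothetical.

```python
import math

def fit_cycloid(x_b, y_b):
    """Return (a, theta_b) so that x = a(theta - sin theta), y = a(1 - cos theta)
    passes through (x_b, y_b); assumes x_b > 0 and y_b > 0 (y measured downward)."""
    ratio = x_b / y_b

    def h(theta):
        # Zero when (theta - sin theta) / (1 - cos theta) equals x_b / y_b.
        return (theta - math.sin(theta)) - ratio * (1.0 - math.cos(theta))

    lo, hi = 1e-6, 2.0 * math.pi     # h(lo) < 0 < h(hi) for ordinary ratios
    for _ in range(200):             # plain bisection
        mid = 0.5 * (lo + hi)
        if h(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    theta_b = 0.5 * (lo + hi)
    a = y_b / (1.0 - math.cos(theta_b))
    return a, theta_b

a, theta_b = fit_cycloid(math.pi, 2.0)   # sample endpoint B = (pi, 2)
print(a, theta_b)
```

For this sample endpoint the fit lands on the bottom of a single arch, since $(\pi, 2)$ is exactly where $\theta = \pi$ on a cycloid with $a = 1$.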

 

                                                                           

[Figure: the cycloid path from A to B]

 

 

 

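How would we find the time it takes to get from A to B?

$$t = \int \frac{ds}{v} = \int \frac{\sqrt{dx^2 + dy^2}}{\sqrt{2gy}}$$

where

$$dx = a(1 - \cos\theta)\,d\theta, \qquad dy = a\sin\theta\,d\theta$$

After lots o' algebra...

$$t = \sqrt{\frac{a}{g}}\;\theta_B$$

Note that the time it would take to reach the bottom corresponds to $\theta_B = \pi$:

$$t = \pi\sqrt{\frac{a}{g}}$$

where $\theta = \pi$ is the bottom of the bowl (from our parametric equations).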
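As a check on the algebra, the sketch below integrates $dt = ds/v$ along the cycloid numerically and compares it with $\theta_B\sqrt{a/g}$. It also compares against a frictionless slide down a straight ramp between the same two points. The value $g = 9.8$ m/s$^2$, the sample cycloid with $a = 1$ ending at $\theta_B = \pi$, and the function names are assumptions for illustration.

```python
import math

G = 9.8  # m/s^2, assumed value of g

def cycloid_time_numeric(a, theta_b, n=100_000):
    """Integrate dt = ds / v along the cycloid, with v = sqrt(2 g y)."""
    h = theta_b / n
    total = 0.0
    for i in range(n):
        th = (i + 0.5) * h
        dx = a * (1.0 - math.cos(th))   # dx/dtheta
        dy = a * math.sin(th)           # dy/dtheta
        y = a * (1.0 - math.cos(th))
        total += math.hypot(dx, dy) / math.sqrt(2.0 * G * y) * h
    return total

def cycloid_time_exact(a, theta_b):
    return theta_b * math.sqrt(a / G)

def straight_line_time(x_b, y_b):
    """Frictionless slide from rest down a straight ramp to (x_b, y_b)."""
    length = math.hypot(x_b, y_b)
    accel = G * y_b / length            # component of g along the ramp
    return math.sqrt(2.0 * length / accel)

a, theta_b = 1.0, math.pi               # cycloid from (0, 0) to (pi, 2)
t_numeric = cycloid_time_numeric(a, theta_b)
t_exact = cycloid_time_exact(a, theta_b)
t_line = straight_line_time(math.pi, 2.0)
print(t_numeric, t_exact, t_line)
```

The numerical integral agrees with the closed form, and the cycloid beats the straight ramp for this endpoint.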

 

 

 

 

            Note that this is obviously a minimum time path.  A maximum time path

                        could go out to infinity and back!

 

 

 

 

 

 

            Example: A curve connects A and B, and this curve is rotated about an axis. Find the curve that minimizes the surface area. This is the lamp-shade problem. The lamp-shade manufacturers want to use the least amount of material possible while maintaining the connection between the small-diameter top and the large-diameter bottom.

 

                                                Are any of these right?  Which one looks like the least?

           

 

 

 

 

 

 

 

 

 

 

                       

 

 

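Rotating the curve about the $y$-axis, the surface area is

$$A = \int 2\pi x\,ds = 2\pi \int_{x_1}^{x_2} x\sqrt{1 + y'^2}\,dx$$

There is no explicit dependence on $y$, so use EI:

$$\frac{\partial f}{\partial y'} = \frac{x\,y'}{\sqrt{1 + y'^2}} = a \;\Rightarrow\; y' = \frac{a}{\sqrt{x^2 - a^2}} \;\Rightarrow\; y = a\cosh^{-1}\!\left(\frac{x}{a}\right) + b, \quad \text{i.e.} \quad x = a\cosh\!\left(\frac{y - b}{a}\right)$$

This shape is called a "catenary" and also happens to be the shape of a hanging chain.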
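To get a feel for the result, the sketch below compares the surface area swept out by a catenary with that of a straight-sided (conical) shade between the same two rims. It assumes rotation about the $y$-axis with integrand $x\sqrt{1 + y'^2}$ and a catenary of the form $y = a\cosh^{-1}(x/a)$. The sample values of $a$ and the rim radii, and the function names, are illustrative assumptions.

```python
import math

def surface_area(y_prime, x1, x2, n=100_000):
    """A = 2*pi * integral of x*sqrt(1 + y'^2) dx, curve rotated about the y-axis."""
    h = (x2 - x1) / n
    total = 0.0
    for i in range(n):
        x = x1 + (i + 0.5) * h
        total += x * math.sqrt(1.0 + y_prime(x) ** 2) * h
    return 2.0 * math.pi * total

a, x1, x2 = 1.0, 1.5, 3.0                        # sample values (assumed)
y1, y2 = a * math.acosh(x1 / a), a * math.acosh(x2 / a)

def catenary_slope(x):
    return a / math.sqrt(x * x - a * a)

def line_slope(x):
    return (y2 - y1) / (x2 - x1)                 # straight-sided shade

area_catenary = surface_area(catenary_slope, x1, x2)
area_cone = surface_area(line_slope, x1, x2)
print(area_catenary, area_cone)

# First integral from EI: x*y'/sqrt(1 + y'^2) should equal a everywhere.
first_integral = [x * catenary_slope(x) / math.sqrt(1.0 + catenary_slope(x) ** 2)
                  for x in (1.5, 2.0, 2.5, 3.0)]
```

For these sample rims the catenary uses slightly less material than the cone, and the quantity $x y'/\sqrt{1 + y'^2}$ stays fixed at $a$ along it.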

            Example: As we saw in the reading, light travels in such a way as to make the travel time an extremum. Between two points, this means it takes the shortest path possible, a line. Between two points when it must first hit a parabolic mirror, it travels in a straight line to the mirror and then in a straight line on to the next point, but that path is a "stationary" path: it is shortest in the sense that each leg is essentially a line, but it is longest in another sense, since there are other straight-line paths that would take a shorter amount of time.

 

 

 

 longest straight-line path

 

 

 

 

 

 

 

 

 

 

                                    So, we know that we are looking to find a place where small

                        variations in the path do not change the path time significantly.

 

 

 

                        Let's derive Snell's Law.

 
 

 

 

 

 

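We already know that light travels in a straight line within a single medium, so the time it takes to travel a distance $d$ along that line is: $t = \frac{d}{v} = \frac{n\,d}{c}$

The total time, then, is: $t = \frac{n_1 d_1}{c} + \frac{n_2 d_2}{c}$.

We will want to extremize the total time.

The endpoints are fixed, so $y_1$ and $y_2$ are fixed, but $x_1$ and $x_2$ will vary depending on where the light hits the glass. To reduce our equation to a single variable, we will use the notation $x$ for $x_1$, and instead of $x_2$ we will use $L - x$, since $L = x_1 + x_2$ is also fixed.

The total travel time becomes:

$$t(x) = \frac{n_1}{c}\sqrt{x^2 + y_1^2} + \frac{n_2}{c}\sqrt{(L - x)^2 + y_2^2}$$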

 

 

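The extreme time path can be found by taking the derivative with respect to the only value that is changing!

$$\frac{dt}{dx} = \frac{n_1}{c}\,\frac{x}{\sqrt{x^2 + y_1^2}} - \frac{n_2}{c}\,\frac{L - x}{\sqrt{(L - x)^2 + y_2^2}}$$

Setting this equal to zero, and noting that $\sin\theta_1 = x/\sqrt{x^2 + y_1^2}$ and $\sin\theta_2 = (L - x)/\sqrt{(L - x)^2 + y_2^2}$ with both angles measured from the normal to the surface, we find that

$$n_1 \sin\theta_1 = n_2 \sin\theta_2$$

Which of course we know as Snell's Law!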
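The same result can be recovered without taking the derivative by hand: minimize the travel time numerically and then check the angles. The sketch below assumes the geometry described above (source a height $y_1$ above the surface, target a depth $y_2$ below it, horizontal separation $L$) and uses a ternary search, which is valid because the travel time is a convex function of the crossing point. The sample values and function names are assumptions for illustration.

```python
import math

def travel_time(x, n1, n2, y1, y2, L, c=1.0):
    """Time for light to go from a point y1 above the surface to a point y2 below
    it, a horizontal distance L away, crossing the surface at position x."""
    return (n1 * math.hypot(x, y1) + n2 * math.hypot(L - x, y2)) / c

def fastest_crossing(n1, n2, y1, y2, L):
    """Ternary search for the minimum of the (convex) travel time on [0, L]."""
    lo, hi = 0.0, L
    for _ in range(200):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if travel_time(m1, n1, n2, y1, y2, L) < travel_time(m2, n1, n2, y1, y2, L):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)

n1, n2, y1, y2, L = 1.0, 1.5, 1.0, 1.0, 2.0      # sample values (assumed)
x = fastest_crossing(n1, n2, y1, y2, L)
sin1 = x / math.hypot(x, y1)
sin2 = (L - x) / math.hypot(L - x, y2)
print(n1 * sin1, n2 * sin2)
```

The two printed products agree to within the accuracy of the search, and the crossing point sits past the midpoint, so the light spends less of its path in the slower medium.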

 

                        Finally (last one!), what happens as light goes through the atmosphere?  As

                        the air gets more and more dense (due to gravity), the index of refraction

                        of air changes continuously.  We could model it like this:

 

 

 
 

 

 

           

 

                        Of course, in the limit that the thickness of the layers goes to zero, it

                        will look like this:

 

 

                                   

 

 

 

 

 

 

 

 

 

 

                        Our goal is to derive a formula to find $\theta_f$ given that we know $\theta_0$ and the function for the index of refraction, $n(y)$.

 

 

 

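It's actually pretty much the same procedure!

$$t = \int \frac{ds}{v} = \frac{1}{c}\int n(y)\sqrt{1 + y'^2}\,dx$$

Use EII since there is no explicit x-dependence!

$$y'\frac{\partial f}{\partial y'} - f = \frac{n\,y'^2}{\sqrt{1 + y'^2}} - n\sqrt{1 + y'^2} = -\frac{n(y)}{\sqrt{1 + y'^2}} = \text{constant}$$

But at any given point, we know that $\sin\theta = \frac{dx}{ds} = \frac{1}{\sqrt{1 + y'^2}}$, since the light travels in a straight line in a differential sheet of atmosphere and $\theta$ is measured from the vertical.

$$n(y)\sin\theta = \text{constant}$$

Which is just Snell's Law!! So all we need to know is the index of refraction at the top and bottom, and the angle at the top.

$$n_0 \sin\theta_0 = n_f \sin\theta_f$$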

 

                       

 

 

 

 

 

The index of refraction at the top of the atmosphere is $n_0 \approx 1$, so all we really need to know is the index of refraction of the level at which we want to look, and the incident angle.

$$\sin\theta_f = \frac{\sin\theta_0}{n_f}$$
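The claim that only the first and last indices matter can be seen by stepping a ray through a stack of thin layers. The sketch below applies Snell's Law at each boundary and compares the final angle with the one obtained directly from the top and bottom indices. The layer values and the function name are hypothetical, chosen only for illustration.

```python
import math

def refract_through_layers(theta0, indices):
    """Apply Snell's Law at each boundary between consecutive layers.

    theta0 is the angle from the vertical in the first layer; indices lists the
    index of refraction of each layer from top to bottom.
    """
    theta = theta0
    for n_in, n_out in zip(indices, indices[1:]):
        theta = math.asin(n_in * math.sin(theta) / n_out)
    return theta

# Hypothetical layer values, for illustration only.
layers = [1.0, 1.0001, 1.00015, 1.0002, 1.0003]
theta0 = math.radians(60.0)
theta_f = refract_through_layers(theta0, layers)
direct = math.asin(layers[0] * math.sin(theta0) / layers[-1])
print(math.degrees(theta_f), math.degrees(direct))
```

The intermediate layers cancel out: the layer-by-layer answer matches the one-step answer, and the ray ends up slightly closer to the vertical than it started.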

 

 

                                                           

 

 

 

 

 

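If we want to solve for $x$ or $y$, then the only additional thing we must do is go back and integrate the derivative we found:

$$\frac{n(y)}{\sqrt{1 + y'^2}} = n_0\sin\theta_0 \;\Rightarrow\; \frac{dx}{dy} = \frac{n_0\sin\theta_0}{\sqrt{n(y)^2 - n_0^2\sin^2\theta_0}} \;\Rightarrow\; x = \int \frac{n_0\sin\theta_0}{\sqrt{n(y)^2 - n_0^2\sin^2\theta_0}}\,dy + C$$

We can solve for $C$ when we integrate and put in the initial condition of whatever the starting position $x_0$ is.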
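When the integral cannot be done by hand, it can be done numerically. The sketch below assumes the conserved quantity $n(y)\sin\theta = n_0\sin\theta_0$, with $y$ measured along the vertical from the starting level, and integrates $dx/dy$ with the midpoint rule. The linear profile $n(y) = 1 + 0.1y$ is a hypothetical example and is not a profile taken from these notes. The function name is likewise an assumption.

```python
import math

def horizontal_position(y, n_of_y, n0, theta0, x0=0.0, steps=10_000):
    """x(y) = x0 + integral from 0 to y of k / sqrt(n(y)^2 - k^2) dy,
    where k = n0 sin(theta0) is the conserved quantity n(y) sin(theta)."""
    k = n0 * math.sin(theta0)
    h = y / steps
    total = 0.0
    for i in range(steps):
        yi = (i + 0.5) * h
        total += k / math.sqrt(n_of_y(yi) ** 2 - k * k) * h
    return x0 + total

theta0 = math.radians(30.0)

# Sanity check: with constant n the ray is a straight line, x = y tan(theta0).
x_uniform = horizontal_position(2.0, lambda y: 1.0, 1.0, theta0)

# Hypothetical profile n(y) = 1 + 0.1*y, index growing along the ray's descent.
x_bent = horizontal_position(2.0, lambda y: 1.0 + 0.1 * y, 1.0, theta0)
print(x_uniform, x_bent)
```

With a growing index the ray bends toward the vertical, so it covers less horizontal distance than the straight-line case over the same vertical distance.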

 

 

 

                        If, for example, we know that , where  is constant.

 

 

                       

                                   

 

 

 

                                    Since  for any point, then

 

 

                                   

 

            So we have found a function for the x-position given the y-position and

                        the initial angle.

 

 

 
