
2.4: Chain Rule

We know how to differentiate sums, differences, products, and quotients of functions. But we have yet to discuss how to differentiate composite functions, such as \(\sin(4x + 3)\) or \(\sqrt{x^2 + 2}.\) In this section, we learn the Chain Rule, which lets us differentiate many composite functions. The Chain Rule is arguably the most important rule in differential calculus, since it is a stepping stone to later units. This section discusses the following topics: the Chain Rule, the General Power Rule, differentiating \(b^x,\) using the Chain Rule multiple times, and a proof of the Chain Rule.

Chain Rule

A composite function can be written in the form \begin{equation*} F(x) = f(g(x)) \cma \end{equation*} where \(f\) is the outer function and \(g\) is the inner function. Alternatively, we could write \(F(x) = (f \circ g)(x).\) Now consider \(F'(x),\) the rate of change of \(F\) with respect to \(x.\) It is logical to assume that \(F'(x)\) depends on both \(f'\) and \(g'.\) In fact, it turns out that \(F'(x)\) is the product of these derivatives, as stated by the Chain Rule: \begin{equation} F'(x) = f'(g(x)) \cdot g'(x) \pd \label{eq:chain-rule} \end{equation}

Chain Rule Expressed in Leibniz Notation Alternatively, we can express \(\eqref{eq:chain-rule}\) in Leibniz notation. For the composite function \(y = f(g(x)),\) we let \(u = g(x)\) to get \(y = f(u).\) We note that \(y\) changes with the variable \(u,\) which changes with \(x.\) Calculating the rate at which \(y\) changes with \(u\) doesn't need the Chain Rule. But we do need the rule to determine the rate of change of \(y\) with respect to \(x.\) In Leibniz notation, we write the Chain Rule as \begin{equation} \deriv{y}{x} = \deriv{y}{u} \deriv{u}{x} \pd \label{eq:chain-leib} \end{equation} We could alternatively write \[\deriv{y}{x} = \deriv{f}{g} \deriv{g}{x} \cma\] in which we view \(f\) and \(g\) as variables such that \(f\) depends on \(g\) and \(g\) depends on \(x.\) Both forms reveal the magic of Leibniz's notation—\(\eqrefer{eq:chain-leib}\) is easy to remember if you imagine canceling the differential \(\dd u.\) We see the following:

If \(y\) changes \(a\) times as fast as \(u,\) and \(u\) changes \(b\) times as fast as \(x,\) then \(y\) changes \(ab\) times as fast as \(x.\)
For example, imagine that a bicycle travels \(6\) times as fast as a walker, and that a car drives \(5\) times as fast as the bicycle. The car therefore travels \(6 \times 5 = 30\) times as fast as the walker.

Applying the Chain Rule is similar to peeling an onion: we start peeling from the outer layer (initially ignoring the inner core) and gradually move inward. Similarly, when differentiating a composite function, we differentiate the outermost layer \((f)\) first and then move on to differentiate the inner layer \((g).\)

THE CHAIN RULE
If \(g\) is differentiable at \(x\) and \(f\) is differentiable at \(g(x),\) then the derivative of the composite function \(F(x) = f(g(x))\) is given by \begin{equation} F'(x) = f'(g(x)) \cdot g'(x) \pd \eqlabel{eq:chain-rule} \end{equation} In Leibniz notation, if \(y = f(g(x))\) and \(u = g(x),\) then \begin{equation} \deriv{y}{x} = \deriv{y}{u} \deriv{u}{x} \pd \eqlabel{eq:chain-leib} \end{equation}
EXAMPLE 1
\[\deriv{}{x} \sin(x^2 + 1)\]
The function \(\sin(x^2 + 1)\) is composite, so differentiating it requires the Chain Rule. We have two methods to do so.

Method 1 In the composite function \(\sin(x^2 + 1),\) we identify the outer function to be \(f(x) = \sin x\) and the inner function to be \(g(x) = x^2 + 1.\) Note that \[f'(x) = \cos x \and g'(x) = 2x \pd\] To obtain \(f'(g(x)),\) in \(f'(x) = \cos x\) we replace \(x\) with \(g(x) = x^2 + 1 \col\) \[f'(g(x)) = \cos \parbr{g(x)} = \cos \par{x^2 + 1} \pd\] Thus, by \(\eqref{eq:chain-rule}\) (the Chain Rule), the derivative of \(f(g(x))\) is \[ \ba f'(g(x)) \cdot g'(x) &= \cos \par{x^2 + 1} \cdot 2x \nl &= \boxed{2x \cos \par{x^2 + 1}} \ea \]

Method 2 In \(y = \sin(x^2 + 1),\) we assign some variable to be the inner function—say, \(u = x^2 + 1.\) Then we have \(y = \sin u,\) from which we see \[\deriv{y}{u} = \cos u \pd\] Also, \[\deriv{u}{x} = \deriv{}{x} \par{x^2 + 1} = 2x \pd\] Hence, by \(\eqref{eq:chain-leib}\) the derivative of \(y\) with respect to \(x\) is given by \[\deriv{y}{x} = \deriv{y}{u} \deriv{u}{x} = (\cos u) (2x) \pd\] But since we chose \(u = x^2 + 1,\) substituting back gives \[ \ba \deriv{y}{x} &= \cos \par{x^2 + 1} \cdot 2x \nl &= \boxed{2x \cos \par{x^2 + 1}} \ea \]
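
As a sanity check on the boxed result, we can compare it with a numerical estimate of the derivative. The following Python sketch is an illustration only: the helper name `central_difference`, the step size `h`, and the sample points are our own assumptions rather than anything prescribed by this section. It approximates the derivative of \(\sin(x^2 + 1)\) with a symmetric difference quotient and prints the estimate beside \(2x \cos(x^2 + 1).\)

```python
import math

def central_difference(func, x, h=1e-6):
    """Approximate func'(x) with a symmetric difference quotient."""
    return (func(x + h) - func(x - h)) / (2 * h)

def F(x):
    # The composite function from Example 1.
    return math.sin(x**2 + 1)

def F_prime(x):
    # The boxed result: f'(g(x)) * g'(x) = cos(x^2 + 1) * 2x.
    return 2 * x * math.cos(x**2 + 1)

# Arbitrary sample points chosen for illustration.
for x in (-1.5, 0.0, 0.7, 2.0):
    numeric = central_difference(F, x)
    formula = F_prime(x)
    print(f"x = {x:5.2f}   numeric = {numeric: .8f}   formula = {formula: .8f}")
```

The two columns should agree to several decimal places. A numerical match of this kind supports the formula but does not prove it.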

EXAMPLE 2
\[\deriv{}{x} \par{\sqrt{x^3 - 2} \,}\]

Method 1 The function \(\sqrt{x^3 - 2}\) is composite; the outer function is \(f(x) = \sqrt x,\) and the inner function is \(g(x) = x^3 - 2.\) Observe that \(f'(x) = 1/(2 \sqrt x),\) so \[f'(g(x)) = \frac{1}{2 \sqrt{g(x)}} = \frac{1}{2 \sqrt{x^3 - 2}} \pd\] Also, \(g'(x) = 3x^2.\) Thus, the Chain Rule, as given by \(\eqrefer{eq:chain-rule},\) gives the derivative of \(\sqrt{x^3 - 2}\) to be \[ \ba f'(g(x)) \cdot g'(x) &= \frac{1}{2 \sqrt{x^3 - 2}} \cdot 3x^2 \nl &= \boxed{\frac{3x^2}{2 \sqrt{x^3 - 2}}} \ea \]

Method 2 In \(y = \sqrt{x^3 - 2},\) we choose some variable \(u\) to be the inner function. Thus, let \(u = x^3 - 2.\) Observe that \[\deriv{y}{u} = \deriv{}{u} \sqrt{u} = \frac{1}{2 \sqrt u} = \frac{1}{2 \sqrt{x^3 - 2}}\] and \[\deriv{u}{x} = \deriv{}{x} \par{x^3 - 2} = 3x^2 \pd\] Therefore, by \(\eqref{eq:chain-leib}\) \[ \ba \deriv{y}{x} &= \deriv{y}{u} \deriv{u}{x} \nl &= \frac{1}{2 \sqrt{x^3 - 2}} \cdot 3x^2 \nl &= \boxed{\frac{3x^2}{2 \sqrt{x^3 - 2}}} \ea \]

EXAMPLE 3
\[\deriv{}{x} \par{\sec^2 x}\]
The function \(\sec^2 x\) doesn't appear to be composite. But upon rewriting it as \((\sec x)^2,\) we identify the outer function to be \(f(x) = x^2\) and the inner function to be \(g(x) = \sec x.\) Notice that \(f'(x) = 2x\) and \(g'(x) = \sec x \tan x.\) Thus, the Chain Rule gives the derivative of \(\sec^2 x\) to be \[f'(g(x)) \cdot g'(x) = (2 \sec x)(\sec x \tan x) = \boxed{2 \sec^2 x \tan x}\]

General Power Rule

Very often, we use the Chain Rule in conjunction with other differentiation rules. Differentiation rules aren't exclusive; in other words, differentiating some functions requires a combination of rules. In Example 2 and Example 3, we used the Power Rule in conjunction with the Chain Rule. Now let us generalize this combination: Let \(g\) be a differentiable function, and consider the family of functions \[y = [g(x)]^n\] for some number \(n.\) To differentiate \(y,\) assuming it is defined, we let \(u = g(x).\) By the Power Rule, we see \[\deriv{y}{u} = \deriv{}{u} \par{u^n} = n u^{n - 1} \pd\] Then by the Chain Rule, \[ \ba \deriv{y}{x} &= \deriv{y}{u} \deriv{u}{x} \nl &= n u^{n - 1} \cdot g'(x) \nl &= n[g(x)]^{n - 1} \cdot g'(x) \pd \ea \] This equation is called the General Power Rule. It is the special case of the Chain Rule in which the outer function is a power function. Because the General Power Rule is derived from the Chain Rule, it is not a new concept but rather a convenient formula to know.

THE GENERAL POWER RULE
If \(g\) is a differentiable function and \(n\) is a constant, then \begin{equation} \deriv{}{x} [g(x)]^n = n \parbr{g(x)}^{n - 1} \cdot g'(x) \cma \label{eq:chain-power} \end{equation} where \([g(x)]^n\) is defined.

In \(\eqref{eq:chain-power}\) if \(g(x) = x,\) then \(g'(x) = 1\) and so the formula becomes \[\deriv{}{x} x^n = nx^{n - 1} \cdot 1 = nx^{n - 1} \cma\] the raw Power Rule. In this special case of \(g,\) the Chain Rule agrees with the Power Rule we have previously established.

EXAMPLE 4
\[\deriv{}{x} \parbr{\par{1 + 2x^3}^{100}}\]
Using the General Power Rule, as in \(\eqref{eq:chain-power},\) with \(n = 100\) and \(g(x) = 1 + 2x^3\) gives the derivative to be \[ \ba 100 \par{1 + 2x^3}^{99} \cdot g'(x) &= 100 \par{1 + 2x^3}^{99} \cdot 6x^2 \nl &= \boxed{600 x^2 \par{1 + 2x^3}^{99}} \ea \] By using the Chain Rule, we solved in a few steps a problem that would otherwise require expanding \(\par{1 + 2x^3}^{100}\) into a polynomial with \(101\) terms.
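
The General Power Rule translates directly into a small helper. The Python sketch below is our own illustration, and the names `general_power_rule`, `example_4`, and `boxed_answer` are hypothetical. It builds the derivative \(n[g(x)]^{n - 1} \cdot g'(x)\) from a function \(g,\) its derivative, and the exponent \(n,\) and then compares the output with the boxed answer of Example 4 at a few integer inputs, where Python's integer arithmetic is exact.

```python
def general_power_rule(g, g_prime, n):
    """Return the function x -> n * g(x)**(n - 1) * g'(x)."""
    def derivative(x):
        return n * g(x) ** (n - 1) * g_prime(x)
    return derivative

# Example 4: g(x) = 1 + 2x^3 with n = 100.
def g(x):
    return 1 + 2 * x**3

def g_prime(x):
    return 6 * x**2

def boxed_answer(x):
    # The simplified result from Example 4.
    return 600 * x**2 * (1 + 2 * x**3) ** 99

example_4 = general_power_rule(g, g_prime, 100)

# Integer inputs keep the arithmetic exact, so == is a fair comparison.
for x in (-2, 0, 1, 3):
    assert example_4(x) == boxed_answer(x)

# With g(x) = x and g'(x) = 1, the rule reduces to the ordinary Power Rule.
power_rule = general_power_rule(lambda x: x, lambda x: 1, 5)
print(power_rule(2))  # 5 * 2**4 = 80
```

Passing \(g\) and \(g'\) separately keeps the sketch free of any symbolic algebra library; the caller supplies the inner derivative by hand, just as in the worked example.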
EXAMPLE 5
\[\deriv{}{x} \parbrBig{(4 - 2x)^5 \par{9x^2 + 4}^{10}}\]
We treat the expression as one product formed by two composite functions: \((4 - 2x)^5\) and \(\par{9x^2 + 4}^{10}.\) Thus, we first apply the Product Rule and then use the Chain Rule to differentiate each function. Using \(\eqref{eq:chain-power},\) we see \[ \ba \deriv{}{x} \parbrBig{(4 - 2x)^5} &= -10(4 - 2x)^4 \cma \nl \deriv{}{x} \parbrBig{\par{9x^2 + 4}^{10}} &= 180x \par{9x^2 + 4}^9 \pd \ea \] By the Product Rule, \[ \ba \deriv{}{x} \parbrBig{(4 - 2x)^5 \par{9x^2 + 4}^{10}} &= \par{9x^2 + 4}^{10} \deriv{}{x} \parbrBig{(4 - 2x)^5} + (4 - 2x)^5 \deriv{}{x} \parbrBig{\par{9x^2 + 4}^{10}} \nl &= \boxed{180x \par{9x^2 + 4}^9 (4 - 2x)^5 - 10(4 - 2x)^4 \par{9x^2 + 4}^{10}} \ea \]

Differentiating \(b^x\)

We know that the derivative of \(e^x\) is \(e^x,\) since \(e = 2.718 \dots\) is the only base whose exponential function equals its own derivative. To differentiate an exponential function whose base isn't \(e,\) we rewrite it with base \(e\) (thus producing a composite function) and use the Chain Rule. Let \(b\) be any positive number, and consider the family of functions \(y = b^x.\) Our goal is to differentiate every function in this family. Since \(b = e^{\ln b},\) we can rewrite the base: \[y = b^x = \par{e^{\ln b}}^x = e^{x \ln b} \pd\] In this composite function, the outer function is \(f(x) = e^x\) and the inner function is \(g(x) = x \ln b.\) Since \(b\) is a constant, \(\ln b\) is also a constant and so \(g'(x) = \ln b.\) And of course, \(f'(x) = e^x.\) Therefore, by \(\eqref{eq:chain-rule}\) we see \[\deriv{}{x} \par{b^x} = e^{x \ln b} \ln b = b^x \ln b \pd\] In words, to differentiate an exponential function whose base isn't \(e,\) we copy the exponential function and multiply it by the natural logarithm of the base. If \(b = e,\) then \(\ln b = 1\) and so the derivative is simply \(e^x.\)

DIFFERENTIATING \(b^x\)
If \(b \gt 0,\) then \begin{equation} \deriv{}{x} \par{b^x} = b^x \ln b \pd \label{eq:diff-b^x} \end{equation}
EXAMPLE 6
\[\deriv{}{x} \par{6^x}\]
Applying \(\eqrefer{eq:diff-b^x}\) with \(b = 6,\) we find the derivative to be \[\boxed{6^x \ln 6}\]
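
To see the formula \(b^x \ln b\) at work numerically, consider the following Python sketch. It is a hedged illustration: the function names, the step size, and the sample point \(x = 1.5\) are our own choices. It evaluates the formula for \(b = 6,\) estimates the same derivative with a symmetric difference quotient, and repeats the estimate on the rewritten form \(e^{x \ln 6}\) to show that both forms describe the same function.

```python
import math

def exp_base_derivative(b, x):
    """Evaluate d/dx b**x = b**x * ln(b); the formula requires b > 0."""
    if b <= 0:
        raise ValueError("the base b must be positive")
    return b**x * math.log(b)

def central_difference(func, x, h=1e-6):
    """Approximate func'(x) with a symmetric difference quotient."""
    return (func(x + h) - func(x - h)) / (2 * h)

x = 1.5  # arbitrary sample point
formula = exp_base_derivative(6, x)
numeric = central_difference(lambda t: 6**t, x)
rewritten = central_difference(lambda t: math.exp(t * math.log(6)), x)

print(f"formula   6^x ln 6        : {formula:.8f}")
print(f"numeric   d/dx 6^x        : {numeric:.8f}")
print(f"numeric   d/dx e^(x ln 6) : {rewritten:.8f}")
```

All three printed values should agree to several decimal places.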

Using the Chain Rule Multiple Times

The function \(y = f(g(h(x)))\) has "three layers" of composition: the outermost layer \(f,\) the middle layer \(g,\) and the innermost layer \(h.\) To differentiate \(y,\) we use the Chain Rule twice, as follows: \begin{align} \deriv{y}{x} &= f'(g(h(x))) \cdot \deriv{}{x} [\orange{g(h(x))}] \nonumber \nl &= f'(g(h(x))) \parbr{\orange{g'(h(x)) \cdot h'(x)}} \nonumber \nl &= f'(g(h(x))) \cdot g'(h(x)) \cdot h'(x) \pd \label{eq:chain-rule-3} \end{align} Or in Leibniz notation, we can write \[\deriv{y}{x} = \deriv{f}{g} \deriv{g}{h} \deriv{h}{x} \pd\] This form explains the name Chain Rule: \(\textderiv{y}{x}\) is given by the product of a chain of derivatives. In general, when a function has \(n\) layers of composition, we use the Chain Rule \(n - 1\) times.

EXAMPLE 7
\[\deriv{}{x} \sin^2(5x - 2)\]
We rewrite the function as \[\parbr{\sin(5x - 2)}^2 \pd\] This function is a composition of three layers; thus, we use the Chain Rule twice. Let us treat \(\sin(5x - 2)\) as one function composed within the squaring function. Thus, an initial application of the Chain Rule gives, following \(\eqref{eq:chain-power}\) (the General Power Rule), \[2 \sin(5x - 2) \cdot \deriv{}{x} \parbr{\orange{\sin(5x - 2)}} \pd\] Now we use the Chain Rule again to differentiate \(\sin(5x - 2).\) So our derivative becomes \[2 \sin(5x - 2) \cdot \parbr{\orange{5 \cos(5x - 2)}} = \boxed{10 \sin(5x - 2) \cos(5x - 2)}\]
EXAMPLE 8
\[\deriv{}{x} \sin(\tan(\cos x))\]
This function is a composition of three functions. Treating \(\tan(\cos x)\) as one function composed within the outermost layer sine, the first application of the Chain Rule gives \[\cos(\tan(\cos x)) \cdot \deriv{}{x} \parbr{\orange{\tan(\cos x)}} \pd\] The second application of the Chain Rule yields \[\cos(\tan(\cos x)) \cdot \parbr{\orange{\sec^2(\cos x) \cdot (- \sin x)}} = \boxed{-\cos(\tan(\cos x)) \sec^2(\cos x) \sin x} \]
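
The pattern in Examples 7 and 8 works for any number of layers: evaluate the layers from the inside out, and then multiply each layer's derivative evaluated at the input that layer receives. The Python sketch below is one possible way to express this, under our own assumptions: the name `chain_derivative` and the representation of a composition as a list of (function, derivative) pairs, ordered from outermost to innermost, are hypothetical conveniences.

```python
import math

def chain_derivative(layers, x):
    """Differentiate a composition at x.

    `layers` is a list of (func, func_prime) pairs ordered from the
    outermost layer to the innermost layer.
    """
    # Work from the inside out, recording the input each layer receives.
    inputs = []
    value = x
    for func, _ in reversed(layers):
        inputs.append(value)
        value = func(value)
    # Multiply each layer's derivative evaluated at its own input.
    product = 1.0
    for (_, func_prime), arg in zip(reversed(layers), inputs):
        product *= func_prime(arg)
    return product

# Example 8: sin(tan(cos x)), listed from outermost to innermost.
example_8 = [
    (math.sin, math.cos),
    (math.tan, lambda u: 1 / math.cos(u) ** 2),  # derivative of tan is sec^2
    (math.cos, lambda t: -math.sin(t)),
]

def boxed_example_8(x):
    inner = math.cos(x)
    return -math.cos(math.tan(inner)) * (1 / math.cos(inner) ** 2) * math.sin(x)

x = 0.3  # arbitrary sample point
print(chain_derivative(example_8, x), boxed_example_8(x))
```

For a composition of \(n\) layers, the loop multiplies \(n\) derivative factors, which corresponds to applying the Chain Rule \(n - 1\) times.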

Proof of the Chain Rule

PROOF Let \(F(x) = f(g(x))\) such that \(g\) is differentiable at \(c\) and \(f\) is differentiable at \(g(c).\) Our goal is to show that \(F'(c) = f'(g(c)) \cdot g'(c).\) By the limit definition of a derivative at a point (see Section 2.1), we assert that \[ \ba F'(c) &= \lim_{x \to c} \frac{F(x) - F(c)}{x - c} \nl &= \lim_{x \to c} \frac{f(g(x)) - f(g(c))}{x - c} \pd \ea \] Our goal is to rewrite this limit to reveal that it's a product of two derivatives. By algebraic manipulation, we obtain \[ F'(c) = \lim_{x \to c} \parbr{\frac{f(g(x)) - f(g(c))}{g(x) - g(c)} \cdot \frac{g(x) - g(c)}{x - c}} \cma \] assuming that \(g(x) \ne g(c)\) for all \(x \ne c\) sufficiently close to \(c\) (the case in which \(g(x) = g(c)\) at points arbitrarily close to \(c\) requires a separate argument). Using the Product Law for Limits (from Section 1.2), this limit becomes \[F'(c) = \par{\lim_{x \to c} \frac{f(g(x)) - f(g(c))}{g(x) - g(c)}} \par{\lim_{x \to c} \frac{g(x) - g(c)}{x - c}} \cma\] provided each limit exists. The second limit is the definition of \(g'(c),\) so we have \[F'(c) = \par{\lim_{x \to c} \frac{f(g(x)) - f(g(c))}{g(x) - g(c)}} \cdot g'(c) \pd\] Now we show that the first limit represents the derivative \(f'(g(c)) \col\) Since \(g\) is differentiable at \(c,\) \(g\) is continuous at \(c\) and so \(g(x) \to g(c)\) as \(x \to c.\) If we let \(u = g(x),\) then we find \[ \ba F'(c) &= \par{\lim_{u \to g(c)} \frac{f(u) - f(g(c))}{u - g(c)}} \cdot g'(c) \nl &= f'(g(c)) \cdot g'(c) \pd \ea \] \[\qedproof\]
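
The factoring step in the proof can be watched numerically. The Python sketch below is an illustration under our own assumptions: we reuse \(f(x) = \sin x\) and \(g(x) = x^2 + 1\) from Example 1, pick the point \(c = 1,\) and use the hypothetical helper name `quotient_factors`. It computes the two factors of the difference quotient for values of \(x\) approaching \(c.\) For this choice of \(g,\) we have \(g(x) \ne g(c)\) whenever \(x\) is near \(c\) and \(x \ne c,\) so the division is safe.

```python
import math

f = math.sin  # outer function

def g(x):
    # Inner function.
    return x**2 + 1

c = 1.0  # point at which we differentiate

def quotient_factors(x):
    """Return the two factors of the difference quotient of f(g(x)) at c."""
    outer = (f(g(x)) - f(g(c))) / (g(x) - g(c))  # tends to f'(g(c)) = cos(2)
    inner = (g(x) - g(c)) / (x - c)              # tends to g'(c) = 2
    return outer, inner

for step in (1e-1, 1e-2, 1e-3, 1e-4):
    outer, inner = quotient_factors(c + step)
    print(f"step = {step:.0e}   outer = {outer:.6f}   inner = {inner:.6f}   "
          f"product = {outer * inner:.6f}")

print(f"f'(g(c)) * g'(c) = {math.cos(g(c)) * 2 * c:.6f}")
```

As the step shrinks, the first factor approaches \(\cos 2,\) the second approaches \(2,\) and their product approaches \(F'(1) = 2 \cos 2.\)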

Chain Rule We use the Chain Rule to differentiate composite functions. If \(g\) is differentiable at \(x\) and \(f\) is differentiable at \(g(x),\) then the derivative of the composite function \(F(x) = f(g(x))\) is given by \begin{equation} F'(x) = f'(g(x)) \cdot g'(x) \pd \eqlabel{eq:chain-rule} \end{equation} In Leibniz notation, if \(y = f(g(x))\) and \(u = g(x),\) then we can write either \begin{flalign} && \deriv{y}{x} &= \deriv{y}{u} \deriv{u}{x} \eqlabel{eq:chain-leib} &\nl \laWord{or} && \deriv{y}{x} &= \deriv{f}{g} \deriv{g}{x} \nonumber \pd \end{flalign} In words, to obtain the derivative of a composite function, we differentiate the outer layer—leaving the inner function alone—and then multiply by the derivative of the inner function.

General Power Rule If \(g\) is a differentiable function and \(n\) is any constant, then \begin{equation} \deriv{}{x} [g(x)]^n = n \parbr{g(x)}^{n - 1} \cdot g'(x) \cma \eqlabel{eq:chain-power} \end{equation} where \([g(x)]^n\) is defined.

Differentiating \(b^x\) The function \(y = b^x\) can be rewritten as the composite function \(y = e^{x \ln b}.\) If \(b \gt 0,\) then the Chain Rule gives \begin{equation} \deriv{}{x} \par{b^x} = b^x \ln b \pd \eqlabel{eq:diff-b^x} \end{equation}

Using the Chain Rule Multiple Times We use the Chain Rule multiple times when a function is composed of more than two layers. In general, if a function is composed of \(n\) layers, then we apply the Chain Rule \(n - 1\) times. For example, the derivative of \(y = f(g(h(x)))\) (composed of three layers) is given by either \begin{flalign} && \deriv{y}{x} &= f'(g(h(x))) \cdot g'(h(x)) \cdot h'(x) \eqlabel{eq:chain-rule-3} &\nl \laWord{or} && \deriv{y}{x} &= \deriv{f}{g} \deriv{g}{h} \deriv{h}{x} \nonumber \pd \end{flalign} You shouldn't memorize \(\eqref{eq:chain-rule-3};\) instead, during your first application of the Chain Rule, we recommend that you view \(g(h(x))\) as a single, ordinary function composed within \(f.\) Be organized in writing out the steps of your differentiation process.