Chain Rule

Now, let's see how to differentiate composite functions.

Suppose that we are given a function $h(x)=f(g(x))$. Remembering that $g'(x)$ is the rate of change of $g(x)$ with respect to $x$ and $f'(g(x))$ is the rate of change of $f$ with respect to $g(x)$, it is reasonable to suggest that the rate of change of $f$ with respect to $x$ is the product of $f'(g(x))$ and $g'(x)$.

Indeed, if $g$ changes twice as fast as $x$ and $f$ changes three times as fast as $g$, we can state that $f$ changes six times as fast as $x$.
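This multiplication of rates can be checked numerically. The sketch below assumes two illustrative linear functions, g(x) = 2x and f(u) = 3u, chosen only to match the factors two and three mentioned above; the helper name `slope` is likewise an assumption of this example.

```python
def g(x):
    # Inner function: changes twice as fast as x (illustrative choice).
    return 2.0 * x

def f(u):
    # Outer function: changes three times as fast as its input.
    return 3.0 * u

def h(x):
    # Composite function h = f(g(x)).
    return f(g(x))

def slope(func, x, dx=1e-3):
    # Average rate of change of func over the interval [x, x + dx].
    return (func(x + dx) - func(x)) / dx

rate_g = slope(g, 1.0)      # rate of g with respect to x
rate_f = slope(f, g(1.0))   # rate of f with respect to g
rate_h = slope(h, 1.0)      # rate of f with respect to x
print(rate_g, rate_f, rate_h)
```

Up to rounding, the third printed value is the product of the first two.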

Chain Rule. If $f$ and $g$ are both differentiable and $h=f\circ g$ is the composite function defined by $h(x)=f(g(x))$, then $h$ is differentiable and $h'(x)=f'(g(x))\,g'(x)$.

Proof. Recall that if $y=f(x)$ and $x$ changes from $a$ to $a+\Delta x$, the increment of $y$ is $\Delta y=f(a+\Delta x)-f(a)$. According to the definition of the derivative, $\lim_{\Delta x\to 0}\frac{\Delta y}{\Delta x}=f'(a)$.

So, if we denote by $\epsilon$ the difference between the difference quotient and the derivative, we obtain that $\lim_{\Delta x\to 0}\epsilon=\lim_{\Delta x\to 0}\left(\frac{\Delta y}{\Delta x}-f'(a)\right)=f'(a)-f'(a)=0$.

But $\epsilon=\frac{\Delta y}{\Delta x}-f'(a)$, or $\Delta y=f'(a)\Delta x+\epsilon\Delta x$.

Thus, for any differentiable function $f$, $\Delta y=f'(a)\Delta x+\epsilon\Delta x$, where $\epsilon\to 0$ as $\Delta x\to 0$.
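To see the error term shrink in practice, here is a minimal sketch that assumes the sample function f(x) = x^3 and the point a = 2, neither of which comes from the text above. For this choice, the error term works out algebraically to 6Δx + Δx², so it should fall roughly tenfold each time Δx does.

```python
def f(x):
    # Sample differentiable function (an assumption of this sketch).
    return x ** 3

a = 2.0
f_prime_a = 3 * a ** 2  # exact derivative of x^3 at a, equal to 12

def epsilon(dx):
    # Difference between the difference quotient and the derivative.
    delta_y = f(a + dx) - f(a)
    return delta_y / dx - f_prime_a

errors = [epsilon(dx) for dx in (0.1, 0.01, 0.001)]
print(errors)
```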

Now, suppose $u=g(x)$ is differentiable at $a$ and $y=f(u)$ is differentiable at $b=g(a)$. If $\Delta x$ is an increment in $x$ and $\Delta u$ and $\Delta y$ are the corresponding increments in $u$ and $y$, we have that:

$\Delta u=g'(a)\Delta x+\epsilon_1\Delta x=\left(g'(a)+\epsilon_1\right)\Delta x$, where $\epsilon_1\to 0$ as $\Delta x\to 0$.

$\Delta y=f'(b)\Delta u+\epsilon_2\Delta u=\left(f'(b)+\epsilon_2\right)\Delta u$, where $\epsilon_2\to 0$ as $\Delta u\to 0$.

Now, substitute the expression for $\Delta u$ into the last equation:

$\Delta y=\left(f'(b)+\epsilon_2\right)\left(g'(a)+\epsilon_1\right)\Delta x$,

or

$\frac{\Delta y}{\Delta x}=\left(f'(b)+\epsilon_2\right)\left(g'(a)+\epsilon_1\right)$.

Since $\Delta u=\left(g'(a)+\epsilon_1\right)\Delta x$, we have $\Delta u\to 0$ as $\Delta x\to 0$. So, both $\epsilon_1\to 0$ and $\epsilon_2\to 0$ as $\Delta x\to 0$.

Therefore, $\frac{dy}{dx}=\lim_{\Delta x\to 0}\left(\left(f'(b)+\epsilon_2\right)\left(g'(a)+\epsilon_1\right)\right)=f'(b)\,g'(a)=f'(g(a))\,g'(a)$.

In Leibniz's notation, if $y=f(u)$ and $u=g(x)$ are both differentiable, $\frac{dy}{dx}=\frac{dy}{du}\frac{du}{dx}$.

In Leibniz's notation, it is especially easy to remember the chain rule, because if $\frac{dy}{du}$ and $\frac{du}{dx}$ were quotients, we could cancel $du$. Remember, however, that $du$ has not been defined and $\frac{du}{dx}$ should not be thought of as an actual quotient.
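As a small illustration of the Leibniz form (the functions here are chosen for this sketch and do not come from the text), take $y=\sin u$ with $u=5x$:

$$\frac{dy}{dx}=\frac{dy}{du}\cdot\frac{du}{dx}=\cos(u)\cdot 5=5\cos(5x).$$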

Example 1. Find the derivative of $h(x)=\sqrt{x^2+1}$.

Here, $f(u)=\sqrt{u}$, $g(x)=x^2+1$, and $h(x)=f(g(x))$; therefore,

$f'(u)=\left(\sqrt{u}\right)'=\frac{1}{2\sqrt{u}}$, and $g'(x)=\left(x^2+1\right)'=2x$.

So, $h'(x)=f'(g(x))\,g'(x)=f'\left(x^2+1\right)\cdot 2x=\frac{1}{2\sqrt{x^2+1}}\cdot 2x=\frac{x}{\sqrt{x^2+1}}$.
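A quick numerical sanity check of this result can be sketched as follows. The helper `central_difference`, its step size, and the sample points are assumptions of this example; the check compares the formula from Example 1 against a finite-difference approximation.

```python
import math

def h(x):
    # Function from Example 1.
    return math.sqrt(x ** 2 + 1)

def h_prime(x):
    # Derivative obtained in Example 1 via the chain rule.
    return x / math.sqrt(x ** 2 + 1)

def central_difference(func, x, step=1e-6):
    # Symmetric finite-difference approximation of func'(x).
    return (func(x + step) - func(x - step)) / (2 * step)

sample_points = (-2.0, 0.0, 0.5, 3.0)
for x in sample_points:
    print(x, h_prime(x), central_difference(h, x))
```

The two printed derivative values should agree to several decimal places at each point.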

When using the chain rule, we work from the outside to the inside: we differentiate the outer function (evaluated at the inner function $g(x)$) and then multiply by the derivative of the inner function.

Example 2. Differentiate $y=\cos\left(x^3\right)$ and $y=\left(\cos(x)\right)^3$.

If $y=\cos\left(x^3\right)$, the outer function is a cosine and the inner is a cubic function; so, $y'=-\sin\left(x^3\right)\cdot\left(x^3\right)'=-3x^2\sin\left(x^3\right)$.

If $y=\left(\cos(x)\right)^3$, the outer function is cubic and the inner is a cosine; so, $y'=3\left(\cos(x)\right)^2\cdot\left(\cos(x)\right)'=-3\left(\cos(x)\right)^2\sin(x)$.
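The outside-to-inside recipe can be written as a small helper that takes the derivative of the outer function, the inner function, and the derivative of the inner function. The name `chain` and the sample point are hypothetical choices for this sketch; swapping the roles of cosine and cube reproduces the two cases of Example 2.

```python
import math

def chain(outer_prime, inner, inner_prime):
    # Build h'(x) = outer'(inner(x)) * inner'(x) for h(x) = outer(inner(x)).
    def h_prime(x):
        return outer_prime(inner(x)) * inner_prime(x)
    return h_prime

# y = cos(x^3): the outer function is cosine, the inner is the cube.
d_cos_of_cube = chain(lambda u: -math.sin(u), lambda x: x ** 3, lambda x: 3 * x ** 2)

# y = (cos x)^3: the outer function is the cube, the inner is cosine.
d_cube_of_cos = chain(lambda u: 3 * u ** 2, math.cos, lambda x: -math.sin(x))

x = 1.2
print(d_cos_of_cube(x), -3 * x ** 2 * math.sin(x ** 3))
print(d_cube_of_cos(x), -3 * math.cos(x) ** 2 * math.sin(x))
```

Each line prints the helper's value next to the closed form from Example 2.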

One more example.

Example 3. Differentiate $y=\left(x^2+1\right)^7$.

$y'=7\left(x^2+1\right)^6\cdot\left(x^2+1\right)'=7\left(x^2+1\right)^6\cdot 2x=14x\left(x^2+1\right)^6$.

Let's work another example.

Example 4. Differentiate $y=\left(\frac{2t+3}{t-5}\right)^8$.

Here, we use the chain rule and the quotient rule.

$y'=8\left(\frac{2t+3}{t-5}\right)^{8-1}\cdot\left(\frac{2t+3}{t-5}\right)'=8\left(\frac{2t+3}{t-5}\right)^{7}\frac{(2t+3)'(t-5)-(2t+3)(t-5)'}{(t-5)^{2}}=$

$=8\left(\frac{2t+3}{t-5}\right)^{7}\frac{2(t-5)-(2t+3)}{(t-5)^{2}}=8\left(\frac{2t+3}{t-5}\right)^{7}\frac{-13}{(t-5)^{2}}=-104\frac{(2t+3)^{7}}{(t-5)^{9}}$.
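Because this example combines two rules, a numerical cross-check is a cheap way to catch sign or exponent slips. The sketch below assumes a central-difference helper and a few sample points away from the singularity at t = 5; both are choices made for this illustration.

```python
import math

def y(t):
    # Function from Example 4 (undefined at t = 5).
    return ((2 * t + 3) / (t - 5)) ** 8

def y_prime(t):
    # Closed form obtained in Example 4.
    return -104 * (2 * t + 3) ** 7 / (t - 5) ** 9

def central_difference(func, t, step=1e-6):
    # Symmetric finite-difference approximation of func'(t).
    return (func(t + step) - func(t - step)) / (2 * step)

sample_points = (0.0, 1.0, 2.0)
for t in sample_points:
    print(t, y_prime(t), central_difference(y, t))
```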

This is clear. Let's do a more complex one.

Example 5. Find the derivative of $f(x)=\left(3x^2+4x+1\right)^5\left(e^x+\sin(x)\right)^2$.

We need to use the product rule together with the chain rule.

$f'(x)=\left(\left(3x^2+4x+1\right)^5\right)'\left(e^x+\sin(x)\right)^2+\left(3x^2+4x+1\right)^5\left(\left(e^x+\sin(x)\right)^2\right)'=$

$=5\left(3x^2+4x+1\right)^4\cdot\left(3x^2+4x+1\right)'\left(e^x+\sin(x)\right)^2+\left(3x^2+4x+1\right)^5\cdot 2\left(e^x+\sin(x)\right)\left(e^x+\sin(x)\right)'=$

$=5\left(3x^2+4x+1\right)^4\cdot(6x+4)\left(e^x+\sin(x)\right)^2+\left(3x^2+4x+1\right)^5\cdot 2\left(e^x+\sin(x)\right)\left(e^x+\cos(x)\right)=$

$=2\left(3x^2+4x+1\right)^4\left(e^x+\sin(x)\right)\left(5(3x+2)\left(e^x+\sin(x)\right)+\left(3x^2+4x+1\right)\left(e^x+\cos(x)\right)\right)$.
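The last step factors out common terms, which is easy to get wrong by hand. As a hedged check, the sketch below evaluates the unfactored and factored expressions at a few arbitrarily chosen points; the function names and the shared helper `parts` are assumptions of this example.

```python
import math

def parts(x):
    # Shared building blocks of both expressions.
    p = 3 * x ** 2 + 4 * x + 1           # polynomial factor
    q = math.exp(x) + math.sin(x)        # e^x + sin(x)
    q_prime = math.exp(x) + math.cos(x)  # derivative of q
    return p, q, q_prime

def unfactored(x):
    # Expression before factoring (product rule plus chain rule).
    p, q, q_prime = parts(x)
    return 5 * p ** 4 * (6 * x + 4) * q ** 2 + p ** 5 * 2 * q * q_prime

def factored(x):
    # Final factored expression from Example 5.
    p, q, q_prime = parts(x)
    return 2 * p ** 4 * q * (5 * (3 * x + 2) * q + p * q_prime)

sample_points = (-0.2, 0.0, 0.5, 1.0)
for x in sample_points:
    print(x, unfactored(x), factored(x))
```

At x = 0 both forms evaluate to 24, which also matches a direct hand computation.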

Now, let's see how to use the chain rule more than once.

Example 6. Differentiate $f(t)=e^{\cos(2t)}$.

We apply the chain rule twice.

$f'(t)=\left(e^{\cos(2t)}\right)'=e^{\cos(2t)}\cdot\left(\cos(2t)\right)'=e^{\cos(2t)}\cdot\left(-\sin(2t)\right)\cdot(2t)'=$

$=-2\sin(2t)\,e^{\cos(2t)}$.

And our final example.

Example 7. Differentiate $f(x)=\cos\left(\sin\left(\tan(x)\right)\right)$.

Here we apply the chain rule twice again.

$f'(x)=\left(\cos\left(\sin\left(\tan(x)\right)\right)\right)'=-\sin\left(\sin\left(\tan(x)\right)\right)\cdot\left(\sin\left(\tan(x)\right)\right)'=$

$=-\sin\left(\sin\left(\tan(x)\right)\right)\cdot\cos\left(\tan(x)\right)\cdot\left(\tan(x)\right)'=$

$=-\sec^2(x)\cos\left(\tan(x)\right)\sin\left(\sin\left(\tan(x)\right)\right)$.

In general, the chain rule can be applied more than twice: we use it as many times as the nesting of the function requires.
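Repeated application can be sketched as a loop over the nested layers, multiplying one derivative factor per layer. The function `chain_derivative`, the layer-list format, and the sample point are assumptions of this illustration; the layers used are those of Example 7.

```python
import math

def chain_derivative(layers, x):
    # layers: list of (function, derivative) pairs, innermost first.
    # Returns the composite's value and its derivative at x.
    value = x
    derivative = 1.0
    for func, func_prime in layers:
        derivative *= func_prime(value)  # this layer's derivative at its own input
        value = func(value)              # pass the output on to the next layer
    return value, derivative

def sec2(x):
    # sec^2(x), the derivative of tan(x).
    return 1.0 / math.cos(x) ** 2

layers = [
    (math.tan, sec2),
    (math.sin, math.cos),
    (math.cos, lambda u: -math.sin(u)),
]

x = 0.3
value, derivative = chain_derivative(layers, x)
closed_form = -sec2(x) * math.cos(math.tan(x)) * math.sin(math.sin(math.tan(x)))
print(value, derivative, closed_form)
```

Adding another pair to `layers` handles one more level of nesting without changing the loop.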