Ehresmann Connection and Gauge Theory

Ehresmann Connection

tldr; Connections on fiber bundles are specified by vertical-bundle-valued one-forms

One might be familiar with the connection as the Christoffel symbols $Γ^{μ}_{ν ρ} (x)$ or the gauge field $A^{μ} (x)$. We will review the idea of the Ehresmann connection, which unifies these 2 seemingly disparate structures in a more abstract framework. It is essential if we are going to generalize the Levi-Civita connection (Christoffel symbols) on the tangent bundle to a spin connection on the spin bundle.

The Ehresmann connection can be defined for any smooth fiber bundle $π : E \to M$ with smooth fiber $F ≅ π^{- 1} (m), m \in M$. If we consider the differential $d π : TE \to TM$, one can see that $dim T_{(m, f)} E = dim E = dim M + dim F = dim T_{m} M + dim F$, which (since $d π$ is surjective) implies that $dim ker d π = dim F$. Intuitively, each tangent space of $E$ has a linear subspace that is “along $F$”, and this leads to the definition of the vertical bundle $V E := ker (d π : TE \to TM)$. One can look for another subbundle that is complementary to $V E$, i.e. $TE = V E \oplus H E$. The intuition is that vectors in $H E$ point in a direction that leads to another fiber, while vectors in $V E$ point in a direction tangent to the current fiber. I would love to add a diagram showing an example with $dim M = dim F = 1$ (the only intuitive diagram one can draw in my opinion).

In the absence of any additional structure, there are many valid choices of $H E$, and our choice of $H E$ affects the vertical component $ver (X) \in V E$ of a vector $X \in TE$. This is analogous to saying that the vector space of polynomials of degree at most 2, $span (1, x, x^{2})$, can be decomposed in many ways:

$$\begin{aligned} TE &= V E \oplus H E \\ \text{analogous to } span (1, x, x^{2}) &= span (1) \oplus span (x, x^{2}) \Rightarrow ver (x + 1) = 1 \\ span (1, x, x^{2}) &= span (1) \oplus span (x + 1, x^{2}) \Rightarrow ver (x + 1) = 0 \end{aligned}$$
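The polynomial analogy can be checked mechanically. The following is a minimal sketch, assuming polynomials are stored as coefficient lists `[c0, c1, c2]` and that the "vertical" line is $span (1)$; the helper names `solve3` and `ver` are made up for this illustration. It expands $x + 1$ in a basis adapted to each decomposition and keeps only the vertical part.

```python
from fractions import Fraction

def solve3(cols, target):
    """Solve sum_k c[k] * cols[k] = target for a 3x3 system (Gauss-Jordan)."""
    n = 3
    # Row r of the augmented matrix holds the r-th coefficient of each basis polynomial.
    aug = [[Fraction(cols[k][r]) for k in range(n)] + [Fraction(target[r])]
           for r in range(n)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [entry / aug[col][col] for entry in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[r][n] for r in range(n)]

def ver(vector, vertical, horizontal):
    """Vertical part of `vector`, given the vertical line and a chosen complement."""
    coeffs = solve3([vertical] + horizontal, vector)
    return [coeffs[0] * c for c in vertical]

ONE, X, X2 = [1, 0, 0], [0, 1, 0], [0, 0, 1]
X_PLUS_1 = [1, 1, 0]

# H = span(x, x^2): the vertical part of x + 1 is the constant polynomial 1.
print(ver(X_PLUS_1, ONE, [X, X2]))
# H = span(x + 1, x^2): x + 1 is now purely horizontal, so its vertical part is 0.
print(ver(X_PLUS_1, ONE, [X_PLUS_1, X2]))
```

The vertical line is the same in both calls; only the complement changes, and that alone changes the answer.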

Connection Form

Instead of providing a horizontal subspace at every tangent space of $E$, it is equivalent to provide a projection map $Φ : TE \to V E$ (linear on each tangent space, with $Φ \circ Φ = Φ$ and image $V E$), from which $H E = ker Φ$ can be inferred.

Punchline: One can say that $Φ$ is a $V E$ -valued one-form on $E$ since it takes a vector in $TE$ and gives an element of $V E$ .
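To make the projection picture concrete, here is a small sketch on the simplest possible bundle, assuming $E = R \times R$ with coordinates $(m, f)$ so that a tangent vector is a pair `(v_m, v_f)` and the vertical direction is `(0, 1)`. The number `a` stands for the value of a hypothetical connection coefficient at the point in question; it is an illustration parameter, not something fixed by the text.

```python
def make_projection(a):
    """Vertical projection Phi at a point where the connection coefficient is `a`."""
    def phi(v):
        v_m, v_f = v
        # Keep only a vertical vector; `a` decides how much of v_m leaks into it.
        return (0, v_f + a * v_m)
    return phi

phi = make_projection(3)

# Phi is the identity on vertical vectors.
print(phi((0, 5)))
# Phi is a projection: applying it twice changes nothing.
print(phi(phi((2, 7))) == phi((2, 7)))
# The horizontal subspace H = ker(Phi) is spanned by (1, -a).
print(phi((1, -3)))
```

Different values of `a` give different kernels, i.e. different horizontal subspaces, while the vertical subspace never changes.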

Ehresmann Connection on $G$-bundle

tldr; Connection on $G$-bundle $P$ is a Lie algebra-valued one-form on $P$

If we take $E$ to be a principal $G$-bundle $P$, then $Φ : TP \to V P$ becomes a Lie algebra-valued one-form. The vertical subspace at $(m, g)$ is tangent to the fiber, which is a copy of $G$, so $V_{(m, g)} P ≅ T_{g} G ≅ T_{e} G ≅ g$ is isomorphic to the Lie algebra (the second isomorphism is the pushforward of left translation by $g^{- 1}$). So the connection $Φ$ on a $G$-bundle is specified by a Lie algebra-valued one-form on $P$ (the total space of the bundle $P \to M$).

However, it turns out that given a local trivialization $U \times G$ of $P \to M$, a Lie algebra-valued one-form $ω^{U} : T U \to g$ on $U \subset M$ carries enough information to determine $Φ_{U}$ on $π^{- 1} (U)$, which is then glued together with the other charts to obtain $Φ$ on the full bundle $P$. This might seem wild at first, since surely pulling $Φ$ back to $M$ to yield $ω^{U}$ loses information about how this one-form behaves along the fibers. The answer is that compatibility with the action of $G$ on the fibers is constraining enough to reconstruct $Φ$ from $ω^{U}$. These $ω^{U}$ are known as the Yang-Mills fields. There are a lot of details, which are covered in the excellent Lecture 22 by Frederic Schuller.

Details of Theorem

The main result is a theorem, which can be found in the following places:

presented at 21:28 of Schuller Lecture 22 (I follow his notation)
Theorem 10.1 in Nakahara
Wiki

Given a section $σ : U \to P$ which induces canonical local trivialization $h : U \times G \to P$ , a $g$ -valued one-form $ω$ on $P$ can be pulled back via $σ$ or $h$ .

The $σ^{*} ω$ pullback induces the $h^{*} ω$ pullback by the following definition $h^{*} ω_{(m, g)} (v, γ) := Ad_{g^{- 1}} ((σ^{*} ω) (v)) + (L_{g^{- 1}})_{*} (γ)$ where $v \in T_{m} U, γ \in T_{g} G$ .

Theorem: This $h^{*} ω$ satisfies the 2 properties of a connection, namely

$(h^{*} ω) (h_{*}^{- 1} A^{#}) = A$
$(R_{k}^{*} (h^{*} ω)) (h_{*}^{- 1} X) = Ad_{k^{- 1}} (h^{*} ω (h_{*}^{- 1} X))$

where $A \in g$ and $A^{#} : P \to TP$ is the fundamental vector field on $P$ generated by $A$, defined pointwise for $p \in P$ as $A_{p}^{#} := \frac{d}{d t} R_{exp (t A)} (p) ∣_{t = 0} \in V_{p} P \subset T_{p} P$

Proof of $(h^{*} ω) (h_{*}^{- 1} A^{#}) = A$ :

Since $A^{#} \in V_{p} P$, the pushforward via $h^{- 1}$ yields $h_{*}^{- 1} A^{#} = (0, γ) \in T_{m} U \oplus T_{g} G$, where $γ = \frac{d}{d t} g \, exp (t A) ∣_{t = 0}$ and $(m, g) = h^{- 1} (p)$. The $Ad$ term in the definition of $h^{*} ω$ vanishes because $v = 0$, and $(L_{g^{- 1}})_{*} γ = \frac{d}{d t} exp (t A) ∣_{t = 0} = A$.

This condition is basically saying that the inverse pullback $h^{- 1 *} h^{*} ω = ω$ satisfies the connection one-form condition $ω (A^{#}) = A$ .

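Proof of $(R_{k}^{*} (h^{*} ω)) (h_{*}^{- 1} X) = Ad_{k^{- 1}} (h^{*} ω (h_{*}^{- 1} X))$ :

Write $h_{*}^{- 1} X = (v, γ)$ at $(m, g)$, so that right translation by $k \in G$ sends it to $(v, R_{k *} γ)$ at $(m, g k)$. Then

$$\begin{aligned} LHS &= (R_{k}^{*} (h^{*} ω)) (h_{*}^{- 1} X) \\ &= h^{*} ω (R_{k *} h_{*}^{- 1} X) \\ &= (h^{*} ω)_{(m, g k)} (v, R_{k *} γ) \\ &= Ad_{(g k)^{- 1}} (σ^{*} ω (v)) + L_{(g k)^{- 1} *} R_{k *} γ \\ &= Ad_{k^{- 1}} Ad_{g^{- 1}} (σ^{*} ω (v)) + L_{k^{- 1} *} R_{k *} L_{g^{- 1} *} γ \\ &= Ad_{k^{- 1}} (Ad_{g^{- 1}} (σ^{*} ω (v)) + L_{g^{- 1} *} γ) \\ &= Ad_{k^{- 1}} (h^{*} ω (h_{*}^{- 1} X)) = RHS \end{aligned}$$

Here we used that left and right translations commute, and that $Ad_{k^{- 1}} = L_{k^{- 1} *} R_{k *}$ on $g$.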

So the above calculations have shown that given a $g$-valued one-form $ω^{U}$ on $U$ and a section $σ : U \to π^{- 1} (U)$, there exists a unique connection one-form $ω$ on $π^{- 1} (U)$ such that $ω^{U} = σ^{*} ω$. The constructed connection depends on $σ$.
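For a matrix Lie group both properties can be tested with plain arithmetic. The sketch below assumes $G = G L (2)$ with rational entries, where $Ad_{g^{- 1}} (Y) = g^{- 1} Y g$, $L_{g^{- 1} *} γ = g^{- 1} γ$ and $R_{k *} γ = γ k$; the particular matrices, and the name `h_omega` for $h^{*} ω$, are arbitrary choices for illustration only.

```python
from fractions import Fraction

def mat(rows):
    """Build a 2x2 matrix with exact rational entries."""
    return [[Fraction(x) for x in row] for row in rows]

def mul(a, b):
    return [[sum(a[i][n] * b[n][j] for n in range(2)) for j in range(2)]
            for i in range(2)]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def inv(a):
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]

def h_omega(g, a_v, gamma):
    """(h^* omega)_{(m,g)}(v, gamma), where a_v stands for (sigma^* omega)(v)."""
    g_inv = inv(g)
    return add(mul(mul(g_inv, a_v), g), mul(g_inv, gamma))

ZERO = mat([[0, 0], [0, 0]])
g = mat([[2, 1], [1, 1]])
k = mat([[1, 2], [0, 1]])
A = mat([[0, 1], [3, -1]])        # element of the Lie algebra gl(2)
a_v = mat([[1, 2], [3, 4]])       # value of sigma^* omega on some base vector v
gamma = mat([[5, 0], [-1, 2]])    # some vector in T_g G

# Property 1: the fundamental vector of A at g is (0, g A); it is sent back to A.
prop1 = h_omega(g, ZERO, mul(g, A)) == A

# Property 2: right translation by k sends (g, gamma) to (g k, gamma k).
lhs = h_omega(mul(g, k), a_v, mul(gamma, k))
rhs = mul(mul(inv(k), h_omega(g, a_v, gamma)), k)
prop2 = lhs == rhs

print(prop1, prop2)  # prints: True True
```

Exact fractions are used so that the equalities hold on the nose rather than up to rounding.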

$\Leftarrow$ construction: Cover $M$ with charts $U^{i}$ over which $P$ is trivial. Each local trivialization induces a section $σ_{i} : U^{i} \to P$. The pullback of $ω$ on $P$ via $σ_{i}$ gives an $ω^{U^{i}}$ on $U^{i}$, and together these form the Yang-Mills field $ω^{M}$ on the base manifold $M$ after gluing. The gluing is what we call “gauge transformations of $A$” or “how $Γ$ transforms under change in coordinates”.

$\Rightarrow$ construction: Cover $M$ with charts $U^{i}$ over which $P$ is trivial. On each chart, the one-form $ω^{U^{i}}$ on $U^{i}$ induces a one-form (which is actually $h^{*} ω$) on the local trivialisation $h$ of $π^{- 1} (U^{i})$ via the theorem above. These $h^{*} ω$ can be glued together to construct a one-form $ω$ on $P$.

Clearing the Ambiguity in Nakahara’s Notation

I learned a lot from Nakahara’s book, but I find the section on the connection one-form slightly ambiguous. For the connection one-form on $P$, the book writes $ω_{i} \equiv g_{i}^{- 1} π^{*} A_{i} g_{i} + g_{i}^{- 1} d g_{i}$ where $A_{i}$ is the Yang-Mills one-form on the base manifold $M$. The first term $g_{i}^{- 1} π^{*} A_{i} g_{i}$, evaluated on a vector, actually means $Ad_{g_{i}^{- 1}} (π^{*} A_{i} (X))$ where $Ad_{g_{i}^{- 1}} : T_{e} G \to T_{e} G$ is the adjoint action of $G$ on $g$, and $X \in T_{p} P$. The $Ad_{g^{- 1}} (Y)$ and $g^{- 1} Y g, Y \in g$ notations coincide for matrix Lie groups, where the multiplication between $g$ and $Y$ is matrix multiplication. I still prefer to talk purely in non-matrix notation because I find it clearer.

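Because $X = H + V$ and $V \in ker π_{*}$, we have $π^{*} A_{i} (X) = A_{i} (π_{*} X)$, and the above actually reduces to

$Ad_{g_{i}^{- 1}} (A_{i} (v))$

where $v = π_{*} X \in T_{m} U^{i} ≅ H_{p} P$ is the image of the horizontal part of $X$. So the first term only sees the horizontal component of $X$; the vertical component is left to the second term.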

The second term is extremely ambiguous. $g_{i}^{- 1}$ is simply $L_{g^{- 1} *}$. Trouble comes from $d g_{i}$. Usually one uses $d g$ to denote the pushforward of $g$. But in this case, $d g$ evaluated on $X$ actually means $γ \in V_{p} P ≅ T_{g} G$. If I include the implicit annotations, $(ω_{i})_{(m, g)} (X) \equiv Ad_{g_{i}^{- 1}} (A_{i} (hor (X))) + L_{g^{- 1} *} (ver (X))$.

So how does one justify $d g = γ$? Here $g$ is interpreted as the fiber coordinate in the local trivialization $h^{- 1} (p) = (m, g)$, i.e. as a map $g : π^{- 1} (U^{i}) \to G$. A vector is the derivative of a curve at one point of the curve, so given a curve $C$ in $P$ which leads to the vector $X = \frac{d C}{d t} ∣_{t = 0}$, we actually mean to define $d g (X) \equiv \frac{d}{d t} g (C (t)) ∣_{t = 0} \in T_{g} G ≅ V_{p} P$, i.e. the derivative of the $G$ fiber “coordinate” of our curve $C (t) = (m (t), g (t))$. In other words, $d g : TP \to TG$ really is the pushforward of the map $g$, and under the identification $T_{g} G ≅ V_{p} P$ we get $d g (X) = ver (X)$.

tldr; $d g$ is $ver (X)$ .

Gluing of Connection One-Form

Given 2 local trivializations over overlapping charts $U^{i}$ and $U^{j}$ (which induce 2 local sections $σ_{i}, σ_{j} : U^{i} \cap U^{j} \to P$), we would like to find how $σ_{i}^{*} ω$ is related to $σ_{j}^{*} ω$. The answer is obtained by investigating how they act on a vector $X \in T_{m} (U^{i} \cap U^{j})$. We need to calculate $σ_{j *} X \in T_{σ_{j} (m)} P$ in terms of $σ_{i *} X \in T_{σ_{i} (m)} P$. Or, in terms of the local trivialisations, $(h_{i}^{- 1} \circ σ_{i})_{*} X \in T_{(m, e)} (U^{i} \times G)$ and $(h_{j}^{- 1} \circ σ_{j})_{*} X \in T_{(m, e)} (U^{j} \times G)$.

We know that $t_{ij} : U^{i} \cap U^{j} \to G$ is the transition map, defined as the unique (guaranteed since $G$ acts freely and transitively on the fibers of $P$) element $t_{ij} (m) \in G$ such that $R_{t_{ij} (m)} (p) = h_{j} \circ h_{i}^{- 1} (p)$. This definition can be rephrased in a more friendly manner as

$$\begin{aligned} h_{i}^{- 1} (p) &= h_{j}^{- 1} (p) ∙_{R} t_{ij} (m) \\ \text{Denoting } h^{- 1} (p) := (m, g (p)), \quad g_{i} (p) &= g_{j} (p) ∙_{G} t_{ij} (m) \end{aligned}$$

Repeating the derivation with $i \leftrightarrow j$ swapped, we get that $t_{ij} = t_{ji}^{- 1}$

If we let $p = σ_{j} (m)$ , then the last equation becomes $g_{i} (σ_{j} (m)) = e ∙_{G} t_{ij} (m) = t_{ij} (m)$ which translates the definition into a statement about transition maps between sections

$σ_{j} (m) = σ_{i} (m) ∙_{R} t_{ij} (m)$

Suppose we have a curve $γ (t)$ in $M$ that yields the vector $T_{m} M ∋ X = \frac{d γ}{d t} ∣_{t = 0}$ at $m := γ (0)$. The pushforward $σ_{j *} X$ acts on a function $f : P \to R$ as

$$\begin{aligned} σ_{j *} (X) (f) &= \frac{d}{d t} f [σ_{j} (γ (t))] ∣_{t = 0} \\ &= \frac{d}{d t} f [σ_{i} (γ (t)) ∙_{R} t_{ij} (γ (t))] ∣_{t = 0} \\ &= \frac{d}{d t} f [σ_{i} (γ (t)) ∙_{R} t_{ij} (m)] ∣_{t = 0} + \frac{d}{d t} f [σ_{i} (m) ∙_{R} t_{ij} (γ (t))] ∣_{t = 0} \\ &= \frac{d}{d t} (f \circ R_{t_{ij} (m)}) [σ_{i} (γ (t))] ∣_{t = 0} + \frac{d}{d t} f [σ_{i} (m) ∙_{R} exp (log t_{ij} (γ (t)))] ∣_{t = 0} \\ &= (\frac{d}{d t} σ_{i} (γ (t)) ∣_{t = 0}) (f \circ R_{t_{ij} (m)}) + (log t_{ij} (m))_{σ_{i} (m)}^{#} f \\ &= R_{t_{ij} (m) *} (\frac{d}{d t} σ_{i} (γ (t)) ∣_{t = 0}) f + (L_{t_{ij}^{- 1} *} \frac{d}{d t} t_{ij} (γ (t)) ∣_{t = 0})_{σ_{i} (m) ∙_{R} t_{ij} (m)}^{#} f \\ &= R_{t_{ij} (m) *} σ_{i *} (\frac{d}{d t} γ (t) ∣_{t = 0}) f + (L_{t_{ij}^{- 1} *} t_{ij *} \frac{d}{d t} γ (t) ∣_{t = 0})_{σ_{j} (m)}^{#} f \\ &= (R_{t_{ij} (m) *} σ_{i *} X) f + (L_{t_{ij}^{- 1} *} t_{ij *} X)_{σ_{j} (m)}^{#} f \\ &= (R_{t_{ij} (m) *} σ_{i *} X + (L_{t_{ij}^{- 1} *} t_{ij *} X)_{σ_{j} (m)}^{#}) f \end{aligned}$$

so the vectors are related as follows: $σ_{j *} X = R_{t_{ij} *} (σ_{i *} X) + (L_{t_{ij}^{- 1} *} t_{ij *} X)^{#}$.

Note that each term is responsible for the horizontal $H$ and vertical $V$ component of the vector $X = H + V$ . Right $G$ -invariance of the horizontal bundle $R_{t_{ij} *} : H_{σ_{i} (m)} P \to H_{σ_{j} (m)} P$ means horizontal gets mapped onto horizontal. The $Y^{#}$ vector field generated from an element $Y$ of the Lie algebra $g$ lives in the vertical bundle too, so vertical maps onto vertical.

This allows us to relate $σ_{i}^{*} ω$ to $σ_{j}^{*} ω$ , which is what we call “gauge transformations” of $A$ .

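$$\begin{aligned} σ_{j}^{*} ω (X) &= ω (σ_{j *} X) \\ &= ω (R_{t_{ij} *} (σ_{i *} X)) + ω ((L_{t_{ij}^{- 1} *} t_{ij *} X)^{#}) \\ &= (R_{t_{ij}}^{*} ω) (σ_{i *} X) + L_{t_{ij}^{- 1} *} t_{ij *} X \\ &= Ad_{t_{ij}^{- 1}} (ω (σ_{i *} X)) + L_{t_{ij}^{- 1} *} t_{ij *} X \\ &= Ad_{t_{ij}^{- 1}} (σ_{i}^{*} ω (X)) + L_{t_{ij}^{- 1} *} t_{ij *} X \end{aligned}$$

The third line uses $ω (Y^{#}) = Y$ and the fourth uses $R_{k}^{*} ω = Ad_{k^{- 1}} \circ ω$, the two defining properties of a connection one-form.

So this is the mathematically precise expression for the familiar gauge transformation $A \mapsto g^{- 1} A g + g^{- 1} d g$.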
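One consistency check on the rule $A \mapsto g^{- 1} A g + g^{- 1} d g$ (not part of the derivation above, just a sanity test) is that transforming by $g$ and then by $h$ must agree with transforming once by the product $g h$. The sketch below assumes a one-dimensional base with coordinate $x$, $2 \times 2$ matrices, and two made-up gauge functions $g (x)$ and $h (x)$ whose derivatives are known exactly; everything is contracted with $\partial_{x}$ at the point $x = 2$.

```python
from fractions import Fraction

def mat(rows):
    """Build a 2x2 matrix with exact rational entries."""
    return [[Fraction(x) for x in row] for row in rows]

def mul(a, b):
    return [[sum(a[i][n] * b[n][j] for n in range(2)) for j in range(2)]
            for i in range(2)]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def inv(a):
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]

def gauge(a, g, dg):
    """A -> g^-1 A g + g^-1 dg, with A and dg evaluated on the same vector."""
    g_inv = inv(g)
    return add(mul(mul(g_inv, a), g), mul(g_inv, dg))

x = Fraction(2)
A = mat([[1, 2], [3, 4]])                               # arbitrary value of A_x
g, dg = mat([[1, x], [0, 1]]), mat([[0, 1], [0, 0]])    # g(x) = [[1, x], [0, 1]]
h, dh = mat([[1, 0], [x, 1]]), mat([[0, 0], [1, 0]])    # h(x) = [[1, 0], [x, 1]]

gh = mul(g, h)
d_gh = add(mul(dg, h), mul(g, dh))                      # product rule for d(gh)

two_steps = gauge(gauge(A, g, dg), h, dh)
one_step = gauge(A, gh, d_gh)
print(two_steps == one_step)  # prints: True

# Even A = 0 picks up the inhomogeneous term g^-1 dg ("pure gauge").
pure_gauge = gauge(mat([[0, 0], [0, 0]]), g, dg)
print(pure_gauge == mat([[0, 1], [0, 0]]))  # prints: True
```

The homogeneous term alone would also compose correctly; the point of the check is that the extra $g^{- 1} d g$ piece composes correctly too, thanks to the product rule.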

Comparison with Frederic Schuller Lecture 22

In case you were wondering how to match the result we derived with Schuller’s Lecture 22 (timestamp 59:08), where he says that $ω_{m}^{U^{(2)}} = Ad_{Ω^{- 1} (m)} \circ ω^{U^{(1)}} + Ω^{*} Ξ$, here is the dictionary:

In our case $ω$ is a $g$ -valued one-form on $P$ whereas his $ω$ is a one-form on $M$ . So our $σ_{j}^{*} ω$ is his $ω^{U^{(j)}}$ .
His $Ω$ is our $t_{ij}$
His Maurer-Cartan form $Ξ_{g} : T_{g} G \to T_{e} G ≅ g$ is just $L_{g^{- 1} *}$; in this case $g$ is $Ω (m)$. For our expression to match his, we just rewrite $L_{t_{ij}^{- 1} *} (t_{ij *} X)$ as $(t_{ij}^{*} L_{t_{ij}^{- 1} *}) X$, i.e. we pull back the $g$-valued one-form $L_{t_{ij}^{- 1} *} : T_{t_{ij}} G \to T_{e} G ≅ g$ on $G$ via $t_{ij} : M \to G$ to yield a $g$-valued one-form $t_{ij}^{*} L_{t_{ij}^{- 1} *} : T_{m} M \to T_{e} G ≅ g$ on $M$. So our $t_{ij}^{*} L_{t_{ij}^{- 1} *}$ is identified with his $Ω^{*} Ξ$.

Calculating Christoffel Symbols

The Christoffel symbols $Γ$ are just the components of the connection one-form $σ^{*} ω : T_{p} M \to g$ pulled back to the base manifold of a tangent frame bundle (which can be shown to be a $G L (d)$-principal bundle). The goal will be to calculate how $Γ$ changes when we change coordinates.

The nice thing is that a coordinate chart $x : U^{(1)} \to R^{d}$ on $U^{(1)} \subset M$ already defines a local trivialization of the $G L (d)$-bundle. Namely, the $d$ coordinate functions $x^{i} : U^{(1)} \to R$ can be used to define a section $σ^{(1)}$ of the frame bundle over $U^{(1)}$, which assigns to each point the coordinate frame

$σ^{(1)} (x) = (\frac{\partial}{\partial x ^{1}}, \frac{\partial}{\partial x ^{2}}, ..., \frac{\partial}{\partial x ^{d}})$

A vector $X \in T_{p} M$ can be written as $X = X^{μ} \partial_{μ}$, with $μ = 1, ..., dim M$. Hence the “Yang-Mills” one-form on the base manifold $M$ takes the form $σ^{(1) *} ω := Γ_{μ} d x^{μ}$, where $Γ_{μ} \in g$.

Since $G = G L (d)$ , the Lie algebra is just $g = gl (d)$ , consisting of all $d \times d$ real matrices. So $Γ_{μ} = Γ^{i}_{j μ}$ where $i, j = 1, ..., d$ are indices for the matrix Lie algebra $gl (d)$ .

We are interested in the transformation law for this $Γ$. Suppose we have another chart $y : U^{(2)} \to R^{d}$ on $U^{(2)} \subset M$. Then on the intersection $U^{(1)} \cap U^{(2)}$, if $γ (t) \in M$ is a curve with tangent vector $X = \dot{γ} (0)$, the gluing formula with $i = 1$, $j = 2$ and $Ω := t_{12}$ reads

$$\begin{aligned} σ_{j}^{*} ω (X) &= Ad_{t_{ij}^{- 1}} (σ_{i}^{*} ω (X)) + L_{t_{ij}^{- 1} *} t_{ij *} X \\ Γ_{μ}^{(2)} d y^{μ} (X) &= Ω^{- 1} (Γ_{μ}^{(1)} d x^{μ} (X)) Ω + Ω^{- 1} \frac{d}{d t} Ω (γ (t)) ∣_{t = 0} \end{aligned}$$

The last term can be rewritten as

$Ω^{- 1} \frac{\partial Ω ( x )}{\partial x ^{μ}} \frac{d x ^{μ} ( γ ( t ))}{d t} = Ω^{- 1} \frac{\partial Ω ( x )}{\partial x ^{μ}} X^{μ}$

So with all the indices fleshed out,

$$Γ^{(2) i}_{j μ} d y^{μ} (X) = (Ω^{- 1})^{i}_{k} (Γ^{(1) k}_{l μ} d x^{μ} (X)) Ω^{l}_{j} + (Ω^{- 1})^{i}_{k} \partial_{μ} Ω^{k}_{j} X^{μ}$$

The vector $X = X^{μ} \frac{\partial}{\partial x ^{μ}}$ expressed in the 2nd chart is $X = \tilde{X}^{μ} \frac{\partial}{\partial y ^{μ}}$, with $X^{μ} = \frac{\partial x ^{μ}}{\partial y ^{ν}} \tilde{X}^{ν}$. So

$d x^{μ} (X) = X^{μ} = \frac{\partial x ^{μ}}{\partial y ^{ν}} \tilde{X}^{ν}$

Putting this back into the above expression,

$$\begin{aligned} {\Gamma^{(2) i}}_{j\mu} \tilde X^\mu &= {(\Omega^{-1})^i}_{k} \left( {\Gamma^{(1)k}}_{l\mu} \frac{\partial x^\mu}{\partial y^\nu}\tilde X^\nu \right) {\Omega^l}_j + {(\Omega^{-1})^i}_k\partial_\mu {\Omega^k}_j X^\mu \end{aligned}$$

Now the transition functions $\Omega:M\rightarrow GL(d)$, or in coordinates, ${\Omega^i}_j: M \rightarrow \mathbb R$, are actually

$${\Omega^i}_j = \frac{\partial x^i}{\partial y^j}$$

$${(\Omega^{-1})^i}_j = \frac{\partial y^i}{\partial x^j}$$

so, using $X^\mu \partial_\mu = \tilde X^\mu \frac{\partial}{\partial y^\mu}$ in the last term,

$$\begin{aligned} {\Gamma^{(2) i}}_{j\mu} \tilde X^\mu &= \frac{\partial y^i}{\partial x^k} \left( {\Gamma^{(1)k}}_{l\mu} \frac{\partial x^\mu}{\partial y^\nu}\tilde X^\nu \right) \frac{\partial x^l}{\partial y^j} + \frac{\partial y^i}{\partial x^k} \frac{\partial^2 x^k}{\partial y^\mu \partial y^j} \tilde X^\mu \end{aligned}$$

Removing the $\tilde X$ (more rigorously, choosing $\tilde X^\mu = \delta^\mu_\rho$, i.e. $X = \frac{\partial}{\partial y^\rho}$),

$$\begin{aligned} {\Gamma^{(2) i}}_{j\rho} &= {\Gamma^{(1)k}}_{l\mu} \frac{\partial y^i}{\partial x^k} \frac{\partial x^\mu}{\partial y^\rho} \frac{\partial x^l}{\partial y^j} + \frac{\partial y^i}{\partial x^k} \frac{\partial^2 x^k}{\partial y^\rho \partial y^j} \end{aligned}$$

Not all indices of $\Gamma$ are of the same type! One of them is a spacetime index; the other 2 actually label the Lie algebra.

### Memorisation Trick

When we learn this in GR we usually get giddy just trying to write it with the correct indices.
However, if we write it as follows and compare with the abstract form, it becomes much clearer what is really going on

$$\begin{aligned} {\Gamma^{(2) i}}_{j\rho} dy^\rho &= \frac{\partial y^i}{\partial x^k} \left({\Gamma^{(1)k}}_{l\mu} \frac{\partial x^\mu}{\partial y^\rho} dy^\rho \right) \frac{\partial x^l}{\partial y^j} + \frac{\partial y^i}{\partial x^k} \frac{\partial}{\partial y^\rho } \left(\frac{\partial x^k}{\partial y^j} \right) dy^\rho\\ \mathbf \Gamma^{(2)}_\rho dy^\rho &= \mathbf\Omega^{-1} \left(\mathbf \Gamma^{(1)}_\mu dx^\mu\right) \mathbf \Omega + \mathbf \Omega^{-1} d \mathbf \Omega \end{aligned}$$
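A quick numerical sanity check of the transformation law for $\Gamma$: take the flat plane with Cartesian coordinates $x = (x^1, x^2)$, where we assume $\Gamma^{(1)} = 0$, and polar coordinates $y = (r, \theta)$. Then only the inhomogeneous term $\mathbf\Omega^{-1} d \mathbf\Omega$ survives, and it should reproduce the textbook polar Christoffel symbols $\Gamma^r_{\theta\theta} = -r$ and $\Gamma^\theta_{r\theta} = \Gamma^\theta_{\theta r} = 1/r$. The function name and the index layout `gamma[i][j][rho]` are choices made for this sketch.

```python
import math

def christoffel_polar(r, theta):
    """Gamma^(2) in y = (r, theta), starting from Gamma^(1) = 0 in Cartesian x."""
    c, s = math.cos(theta), math.sin(theta)
    # (Omega^{-1})^i_k = dy^i/dx^k, the inverse of Omega^k_j = dx^k/dy^j
    # with x^1 = r cos(theta), x^2 = r sin(theta).
    omega_inv = [[c, s], [-s / r, c / r]]
    # d_omega[rho][k][j] = d^2 x^k / (dy^rho dy^j)
    d_omega = [
        [[0.0, -s], [0.0, c]],          # derivative of Omega with respect to r
        [[-s, -r * c], [c, -r * s]],    # derivative of Omega with respect to theta
    ]
    # gamma[i][j][rho] = (Omega^{-1})^i_k * d_rho Omega^k_j
    return [[[sum(omega_inv[i][k] * d_omega[rho][k][j] for k in range(2))
              for rho in range(2)]
             for j in range(2)]
            for i in range(2)]

R, THETA = 2.0, 0.7
gamma = christoffel_polar(R, THETA)

# Index 0 is r, index 1 is theta.
print(gamma[0][1][1])   # Gamma^r_{theta theta}, expected to be close to -R
print(gamma[1][0][1])   # Gamma^theta_{r theta}, expected to be close to 1/R
print(gamma[1][1][0])   # Gamma^theta_{theta r}, expected to be close to 1/R
```

Here the two lower indices happen to be symmetric because $\Omega$ is a Jacobian, so its derivative is a matrix of second partial derivatives.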
