Tensors (summation convention)

This thread discusses the chain rule as it is used when transforming tensors with covariant and contravariant indices. The question concerns equation 5 of the linked page, where it is unclear how to get from term 3 to term 4: writing out the summation over the repeated index appears to give three Kronecker deltas instead of one. The replies explain the notation in terms of coordinate systems and the chain rule, and locate the error: summing over the repeated index is correct, but the chain rule applies to the sum as a whole, not to each term separately.
  • #1
bremvil
Hi everyone,

I recently started a course on continuum mechanics. It began with the mathematical background of transforming tensors with contravariant and/or covariant indices. There is one thing I don't understand, and it should be really straightforward. I hope you can give me a hint.

http://s35.photobucket.com/albums/d174/Brasempje/?action=view&current=tensor.jpg

In equation 5 on the page that I linked above I do not see how I can get from term 3 to term 4. I end up with 3 Kronecker deltas instead of just 1. Since the index 'm' appears twice, summation can be performed, and it should lead to the same result as when you use a 'shortcut'. I can show what I do using 'latex' notation, with ^ = superscript, _ = subscript, d = Kronecker delta, t = theta

'term 3' in Equation (5) reads:
(dt^i/dx^m) * (dx^m/dt^j)

If I write out the summation, this expression turns into:
(dt^i/dx^1) * (dx^1/dt^j) + (dt^i/dx^2) * (dx^2/dt^j) + (dt^i/dx^3) * (dx^3/dt^j)

Each component of vector x is a function of all three components of vector theta. And each component of vector theta is a function of all three components of vector x. By the chain rule the last expression would become

dt^i/dt^j + dt^i/dt^j + dt^i/dt^j

this is:
d^i_j + d^i_j + d^i_j = 3 * d^i_j

So if I carry out the summation, I end up with something different from what I would expect: 3 Kronecker deltas instead of 1! Is there any objection to using a sum in this case?

with kind regards,

Bremvil
 
  • #2
That equality is just the chain rule. It doesn't have anything to do with tensors or properties of the Kronecker delta.

When [itex]g:\mathbb R^n\rightarrow\mathbb R^n[/itex] and [itex]f:\mathbb R^n\rightarrow\mathbb R[/itex], I like to write the chain rule like this:

[tex](f\circ g)_{,i}(x)=f_{,j}(g(x)) g^j{}_{,i}(x)[/tex]

Here ",i" denotes partial derivative with respect to the ith variable, and [itex]g^j[/itex] is the jth component of the function g. If [itex]f:\mathbb R^n\rightarrow\mathbb R^n[/itex], we have

[tex](f\circ g)^i(x)=(f(g(x)))^i=f^i(g(x))=(f^i\circ g)(x)[/tex]

so

[tex]\delta^i_j=(f\circ f^{-1})^i{}_{,j}(x)=(f^i\circ (f^{-1}))_{,j}(x)[/tex]

Define [itex]g=f^{-1}[/itex] to unclutter the notation somewhat. Then the above is

[tex]=(f^i\circ g)_{,j}(x)=f^i{}_{,k}(g(x))g^k{}_{,j}(x)[/tex]

and...uhh...I can't explain why this is written in the form

[tex]\frac{\partial A^i}{\partial B^k}\frac{\partial B^k}{\partial A^j}[/tex]

without explaining partial derivatives with respect to a coordinate system. (Edit: Actually I can. See the comments at the end of the post.) On a manifold, expressions like f(x+h) don't work, because in general, addition isn't defined for points in the manifold. This is why we have to use a coordinate system to define partial derivatives. A coordinate system is a function [itex]x:U\rightarrow\mathbb R^n[/itex], where U is an open subset of the manifold that contains the point p at which we want to define the partial derivative. If f is a function from the manifold to the real numbers, we define

[tex]\frac{\partial f}{\partial x^i}(p)=(f\circ x^{-1})_{,i}(x(p))[/tex]

In particular, if y is another coordinate system,

[tex]\frac{\partial y^j}{\partial x^i}(p)=(y^j\circ x^{-1})_{,i}(x(p))[/tex]

The f above is a coordinate change function, i.e. an expression of the form [itex]A\circ B^{-1}[/itex], where A and B are coordinate systems. So we have

[tex]\delta^i_j=f^i{}_{,k}(g(x))g^k{}_{,j}(x)=(A^i\circ B^{-1})_{,k}(g(x))(B^k\circ A^{-1}){}_{,j}(x)=\frac{\partial A^i}{\partial B^k}(A^{-1}(x))\frac{\partial B^k}{\partial A^j}(A^{-1}(x))[/tex]

Edit: It turned out to be easier than I thought to explain the notation. No techniques from differential geometry are needed.

[tex]\delta^i_j=f^i{}_{,k}(g(x))g^k{}_{,j}(x)=f^i{}_{,k}(g(x))g^k{}_{,j}(f(g(x)))=\frac{\partial f^i}{\partial g^k}(g(x))\frac{\partial g^k}{\partial f^j}(f(g(x)))[/tex]
 
  • #3
Dear Fredrik,

Thanks for your reply. I read through it carefully, but I find it quite difficult; my math background is not as strong as yours. So I'm not really there yet. In your explanation you started with:

[tex]
(f\circ g)_{,i}(x)=f_{,j}(g(x)) g^j{}_{,i}(x)
[/tex]

but this step is basically my entire problem! The index j appears twice, which would mean summation, right? I will try to write down my original problem in LaTeX. Could you maybe tell me where I am making the error?

http://s35.photobucket.com/albums/d174/Brasempje/?action=view&current=tensor.jpg
Equation 5; the problem is in the step from term 3 to term 4.


[tex]
x^i = x^i(\theta^1 ,\theta^2 , \theta^3)
[/tex]

[tex]
\theta^i = \theta^i(x^1 ,x^2 , x^3)
[/tex]

If I decide to apply summation over 'index m' in the equation below I get:

[tex]
\frac{\partial \theta^i}{\partial x^m}\frac{\partial x^m}{\partial \theta^j} =
\frac{\partial \theta^i}{\partial x^1}\frac{\partial x^1}{\partial \theta^j} + \frac{\partial \theta^i}{\partial x^2}\frac{\partial x^2}{\partial \theta^j} + \frac{\partial \theta^i}{\partial x^3}\frac{\partial x^3}{\partial \theta^j}
[/tex]

by the chain rule I get

[tex]
\frac{\partial \theta^i}{\partial \theta^j} + \frac{\partial \theta^i}{\partial \theta^j} + \frac{\partial \theta^i}{\partial \theta^j} = \delta^i_j + \delta^i_j + \delta^i_j = 3\delta^i_j
[/tex]

So I get 3 times the Kronecker delta instead of only 1 Kronecker delta. I guess I should not do the summation for some reason, but I don't understand why. The index m appears twice, so a summation should be justified. Everywhere in the book 'Classical and Computational Solid Mechanics', a repeated index means summation.
 
  • #4
bremvil said:
[tex]
\frac{\partial \theta^i}{\partial x^m}\frac{\partial x^m}{\partial \theta^j} =
\frac{\partial \theta^i}{\partial x^1}\frac{\partial x^1}{\partial \theta^j} + \frac{\partial \theta^i}{\partial x^2}\frac{\partial x^2}{\partial \theta^j} + \frac{\partial \theta^i}{\partial x^3}\frac{\partial x^3}{\partial \theta^j}
[/tex]

by the chain rule I get

[tex]
\frac{\partial \theta^i}{\partial \theta^j} + \frac{\partial \theta^i}{\partial \theta^j} + \frac{\partial \theta^i}{\partial \theta^j} = \delta^i_j + \delta^i_j + \delta^i_j = 3\delta^i_j
[/tex]
The first line is correct, but that's not the chain rule. By the chain rule, what you have on the upper right is equal to

[tex]\frac{\partial\theta^i}{\partial\theta^j}[/tex]

(once, not three times).
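
A concrete check may make this easier to see. The sketch below is not from the thread: it assumes plane polar coordinates as a hypothetical example, with theta = (r, phi) and x = (r cos phi, r sin phi), and the helper names `dtheta_dx`, `dx_dtheta` and `contract` are made up for illustration. It sums over the repeated index m and shows that the whole sum gives the Kronecker delta, while a single term of the sum does not.

```python
import math


def dtheta_dx(x, y):
    """Jacobian d(theta^i)/d(x^m) for theta = (r, phi); rows i, columns m."""
    r2 = x * x + y * y
    r = math.sqrt(r2)
    return [[x / r, y / r],
            [-y / r2, x / r2]]


def dx_dtheta(r, phi):
    """Jacobian d(x^m)/d(theta^j) for x = (r cos phi, r sin phi); rows m, columns j."""
    return [[math.cos(phi), -r * math.sin(phi)],
            [math.sin(phi), r * math.cos(phi)]]


def contract(a, b):
    """Sum over the repeated index m: result[i][j] = sum_m a[i][m] * b[m][j]."""
    n = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]


# An arbitrary point, chosen only for illustration.
r, phi = 2.0, 0.7
x, y = r * math.cos(phi), r * math.sin(phi)

A = dtheta_dx(x, y)
B = dx_dtheta(r, phi)

# The full sum over m reproduces the Kronecker delta (identity matrix).
product = contract(A, B)

# One term of the sum (m = 0, with i = j = 0) is cos^2(phi), not 1.
first_term = A[0][0] * B[0][0]

print(product)
print(first_term)
```

Each individual product such as (d theta^i/d x^1)(d x^1/d theta^j) is only one piece of the chain rule; the pieces add up to the delta, they are not each equal to it.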
 
  • #5
So basically you are saying that the term on the top right:

[tex]
\frac{\partial \theta^i}{\partial x^m}\frac{\partial x^m}{\partial \theta^j} =
\frac{\partial \theta^i}{\partial x^1}\frac{\partial x^1}{\partial \theta^j} + \frac{\partial \theta^i}{\partial x^2}\frac{\partial x^2}{\partial \theta^j} + \frac{\partial \theta^i}{\partial x^3}\frac{\partial x^3}{\partial \theta^j}
[/tex]

equals a single Kronecker delta? The way I interpret it, each term within the expression above is a single Kronecker delta.
Fredrik said:
The first line is correct, but that's not the chain rule. By the chain rule, what you have on the upper right is equal to

[tex]\frac{\partial\theta^i}{\partial\theta^j}[/tex]

(once, not three times).
 
  • #6
bremvil said:
The way I interpret it, each term within the expression above is a single Kronecker delta.
You're not applying the chain rule correctly. Another example:

[tex]\frac{d}{dx}f(g(x),h(x))=\frac{\partial f}{\partial g}\frac{dg}{dx}+\frac{\partial f}{\partial h}\frac{dh}{dx}\neq \frac{df}{dx}+\frac{df}{dx}[/tex]
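
The inequality above can also be checked numerically. The following is a minimal sketch, not part of the original reply: the functions f(u, v) = u v, g(x) = sin x and h(x) = x^2 are arbitrary choices made here for illustration, and `central_difference` is a hypothetical helper. It compares the two-term chain rule with a finite-difference derivative of the composite function.

```python
import math


def g(x):
    return math.sin(x)


def h(x):
    return x * x


def f(u, v):
    return u * v


def composite(x):
    """The function x -> f(g(x), h(x))."""
    return f(g(x), h(x))


def chain_rule(x):
    """(df/du)(dg/dx) + (df/dv)(dh/dx), with the partials evaluated at (g(x), h(x))."""
    df_du = h(x)  # partial of u*v with respect to u is v
    df_dv = g(x)  # partial of u*v with respect to v is u
    return df_du * math.cos(x) + df_dv * 2.0 * x


def central_difference(func, x, step=1e-6):
    """Symmetric finite-difference estimate of d(func)/dx."""
    return (func(x + step) - func(x - step)) / (2.0 * step)


x0 = 1.3
exact = chain_rule(x0)
numeric = central_difference(composite, x0)
naive = 2.0 * numeric  # the incorrect "df/dx + df/dx" reading

print(exact, numeric, naive)
```

The two terms of the chain rule together equal the total derivative once; treating each term as a full derivative counts it twice.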
 
  • #7
I finally see it! Thanks a lot.
 

Related to Tensors (summation convention)

1. What are tensors and what is the summation convention?

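Tensors are multilinear objects whose components transform in a definite way under a change of coordinates; they are used to express relationships between physical quantities independently of the coordinate system. The summation convention is a shorthand notation that omits the summation sign in tensor equations.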

2. How does the summation convention work?

In the summation convention, an index that appears twice in a single term (typically once as an upper and once as a lower index) implies summation over all values of that index. This shortens tensor equations considerably.

3. What is the significance of the summation convention in tensor calculus?

The summation convention allows a concise representation of long equations in tensor calculus. It also makes some errors easy to spot: an index that appears three times in one term, or a free index that does not match on both sides of an equation, signals a mistake.

4. What are the advantages of using the summation convention?

The summation convention simplifies tensor equations, making them easier to read and understand. It also reduces the number of terms that must be written out, although the implied sum still contains all of them.

5. Are there any limitations to the summation convention?

While the summation convention is a powerful tool, it can sometimes lead to confusion if not used correctly. It is important to be careful when choosing indices and to properly keep track of them throughout the equation.
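
The rule described in question 2 can be sketched in a few lines. This is an illustrative example only: the helper names `contract` and `kronecker_delta` and the sample components are assumptions made here, not taken from the thread. It also shows the one place where a factor of 3 legitimately appears in three dimensions, namely when the Kronecker delta is contracted with itself.

```python
def contract(a, b):
    """Contraction a_i b^i: sum the products over the repeated index i."""
    return sum(a_i * b_i for a_i, b_i in zip(a, b))


def kronecker_delta(i, j):
    """Kronecker delta: 1 if the indices are equal, 0 otherwise."""
    return 1 if i == j else 0


# Sample components, chosen only for illustration.
a = [1.0, 2.0, 3.0]
b = [4.0, 5.0, 6.0]

# a_i b^i means a_1 b^1 + a_2 b^2 + a_3 b^3.
a_dot_b = contract(a, b)

# delta^i_i has a repeated index, so it is a sum over i and equals the dimension.
delta_trace = sum(kronecker_delta(i, i) for i in range(3))

# delta^i_j b^j with free index i = 1 picks out the single component b^1.
picked = sum(kronecker_delta(1, j) * b[j] for j in range(3))

print(a_dot_b, delta_trace, picked)
```

Note the difference between delta^i_j, which has two free indices and no implied sum, and delta^i_i, which is summed.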
