Partitions: The Most Powerful Tool In Mathematics

March 8, 2013

If you have a big object and cannot understand it, what do you do? You break the big object into little objects, try to make sense of them, and then build back up to the big object again.

These little objects are partitions of the big object.

Let us see how this tool has been used in some areas of mathematics:

Integration (Analysis)

The first and original integral, the Riemann integral, partitions an interval into subintervals and then forms specific sums (Riemann sums) over these partitions; when the sums satisfy certain criteria, the function is integrable.
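As a minimal sketch (with a made-up integrand), a left Riemann sum over an ever-finer partition of [0, 1] converges to the integral:

```python
# Approximate the integral of f(x) = x^2 on [0, 1] with a left Riemann sum.
# Partition [0, 1] into n subintervals and sum f at the left endpoints.
def riemann_sum(f, a, b, n):
    width = (b - a) / n          # width of each subinterval in the partition
    return sum(f(a + i * width) * width for i in range(n))

approx = riemann_sum(lambda x: x * x, 0.0, 1.0, 100_000)
# The exact value is 1/3; the sum converges to it as the partition refines.
print(abs(approx - 1/3) < 1e-4)  # True
```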

The result? The integral, integration in general, and of course the Fundamental Theorem of Calculus.

Group Theory (Algebra)

Suppose you have a group. Represent this group as a big box. The things inside this big box clearly make it what it is: the big box.

Partition the things inside into sets. If one of those sets has some structure of its own (a subgroup), use it to partition the whole group into cosets, and consider how the cardinalities of the pieces divide the cardinality of the group.

What is inside one box may not be what is inside another box (in group talk: a left coset is not necessarily equal to the corresponding right coset).

The result? Lagrange’s theorem (there are other results, but the sheer beauty of that one should be enough).
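A small sketch of the idea, using the example of Z_12 under addition mod 12 and the subgroup H = {0, 4, 8} (the choice of group and subgroup here is just illustrative): the cosets of H partition the group into equal-sized pieces, so |H| divides |G|.

```python
# Partition Z_12 (integers mod 12 under addition) into cosets of the
# subgroup H = {0, 4, 8}.  Lagrange: |G| = |H| * (number of cosets).
G = set(range(12))
H = {0, 4, 8}

cosets = {frozenset((g + h) % 12 for h in H) for g in G}
assert all(len(c) == len(H) for c in cosets)   # every coset has size |H|
assert len(G) == len(H) * len(cosets)          # Lagrange: 12 = 3 * 4
print(len(cosets))  # 4
```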

Graph Theory (Combinatorics)

Suppose you have a graph. Partition it into something smaller: a collection of subgraphs. If, say, the vertex set splits into a disjoint union of independent classes (an n-partite graph), we can understand the graph’s structure.

A specific structure will allow us to tell whether the graph is planar and to resolve classic real-life problems such as the utilities problem.

The result: Euler’s characteristic, the Four-Colour Theorem, the Perron-Frobenius theorem.
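The 2-partite case can be sketched concretely: a graph is bipartite exactly when its vertices can be partitioned into two classes with every edge crossing between them, which a breadth-first 2-colouring detects. K_{3,3}, the graph of the utilities problem, passes this check even though it is not planar. A minimal sketch:

```python
from collections import deque

def is_bipartite(adj):
    """Try to partition the vertices into two classes (a 2-colouring)
    so that every edge crosses between the classes."""
    colour = {}
    for start in adj:
        if start in colour:
            continue
        colour[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in colour:
                    colour[v] = 1 - colour[u]   # opposite class to u
                    queue.append(v)
                elif colour[v] == colour[u]:    # edge inside one class
                    return False
    return True

# K_{3,3}, the utilities graph: bipartite, yet famously non-planar.
k33 = {u: list(range(3, 6)) for u in range(3)}
k33.update({v: list(range(3)) for v in range(3, 6)})
print(is_bipartite(k33))       # True
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
print(is_bipartite(triangle))  # False: an odd cycle has no 2-partition
```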

Stochastic Processes (Probability)

Suppose you are interested in some event B in a sample space Ω. But for B to happen you must first “be” in some place, say A. Maybe you are in A_0, A_1, A_2, … and so on; the point is, you must be somewhere, and these places partition Ω. You then decompose the chance of event B occurring as the sum, over the partition, of the chance of B occurring given that you are in A_i, multiplied by the chance of you being in A_i.

This is the Law Of Total Probability.
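A minimal sketch of the Law of Total Probability, with made-up numbers for the partition A_0, A_1, A_2:

```python
# Law of total probability: P(B) = sum_i P(B | A_i) * P(A_i),
# where the A_i partition the sample space.
# Illustrative (made-up) numbers for three places you might be in.
p_A = [0.5, 0.3, 0.2]            # P(A_i); sums to 1, since the A_i partition
p_B_given_A = [0.1, 0.4, 0.9]    # P(B | A_i)

p_B = sum(pb * pa for pb, pa in zip(p_B_given_A, p_A))
print(p_B)  # 0.5*0.1 + 0.3*0.4 + 0.2*0.9, approximately 0.35
```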

The result: Chapman-Kolmogorov equation.

Actually this relation can be written in matrix form, which then lets us prove big theorems for Markov chains and, more generally, Markov processes.
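A small sketch of the matrix form, for a hypothetical two-state chain: multiplying the one-step transition matrix by itself gives multi-step transition probabilities, and Chapman-Kolmogorov says P^m P^n = P^(m+n) in either order.

```python
# Chapman-Kolmogorov in matrix form: if P is the one-step transition
# matrix of a Markov chain, then P^m * P^n = P^(m+n).
# Toy two-state chain with made-up transition probabilities.
P = [[0.9, 0.1],
     [0.5, 0.5]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

P2 = matmul(P, P)      # two-step transition probabilities
P3_a = matmul(P2, P)   # P^2 * P^1
P3_b = matmul(P, P2)   # P^1 * P^2: Chapman-Kolmogorov says they agree
assert all(abs(P3_a[i][j] - P3_b[i][j]) < 1e-12
           for i in range(2) for j in range(2))
print(P2[0][0])  # chance of being in state 0 after 2 steps, starting in 0
```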

The result: Markov chains, Ergodic theorem

Even more, a great deal of probability theory is built on this idea of conditioning on a partition.

Introductory stochastic processes classes look at how some initial understanding of probability (and other basic analysis modules) allows us to understand the world: reliability theory (how likely is it that something will break down?), queuing theory (what is the chance that you wait for a specific time period in a given queue at, say, the supermarket, before you reach the till?) and so on.
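As one sketch of the reliability flavour (with made-up parameters): if lifetimes are modelled as Exponential(rate), the chance a component is still working after time t is e^(-rate*t), which a quick Monte Carlo check confirms.

```python
import math
import random

# Reliability sketch (illustrative rate): lifetimes ~ Exponential(rate),
# so P(still working after time t) = exp(-rate * t).  Monte Carlo check.
random.seed(0)
rate, t, n = 0.5, 2.0, 100_000
alive = sum(random.expovariate(rate) > t for _ in range(n)) / n
print(alive, math.exp(-rate * t))  # the two numbers agree to ~2 decimals
```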

Graduate classes then use the idea of partitions and breaking things down so much that it links nicely with the “rigorous” definition of probability: measure theory.

These are just some applications; the explanations here are not detailed enough to fully convey why partitions are so crucial across so many fields of mathematics.

The point: if you do not understand something, break it down into what you can (or will) understand. Then collect these little pieces together and see what you get.


An Introduction To Stochastic Calculus

October 30, 2012

When we integrate, differentiate, or evaluate sums and products, our calculations use numbers (usually real, sometimes complex). They define the “answer” we get: a quantity with a physical interpretation of what we have done.

Suppose instead of numbers, we use variables. Specifically, we use random variables. Then we are using a new calculus, appropriately named stochastic calculus (“stochastic” meaning random).

This post will not go into probability theory, nor will it define what a random variable is; instead we introduce stochastic calculus without any great formality.

Suppose you want to look at the stock market, or any environment that enough humans touch in the hope of gaining something in return. Define X(t) to be the price of asset X at time t, and consider some later time t + dt. We ask several questions:

How small does dt need to be?
What happens when it is infinitesimal?
Does it relate to some specific probability distribution?
Can we make a formula for X without involving any explicit, complicated functions?

First, we continue with our asset: given a change in time, we have a change in price dX(t).

Suppose the asset’s environment is responsive to change and there are many users like us, each looking after their own asset. Is the change in the asset price simply how the market fluctuates at the given time, multiplied by some constant for normalisation?

Suppose this is true; then we have the differential equation dX(t) = \sigma \, dW(t)

(where dX(t) is the change in the asset’s price at time t, \sigma is a constant, and dW(t) is some bizarre function that captures the change in market fluctuations: how the market, and more generally the environment, responds to our asset).

Integrating and using the initial condition X(0) = \alpha, we have a seemingly nice and intuitive solution.

We have X(t) = \alpha + \sigma \int_0^t dW(s) = \alpha + \sigma W(t) (taking W(0) = 0).

For short time periods it works fine. But it is hugely problematic.

As time increases, the probability of the asset having a negative price grows, and it is non-zero for any future time. Formally, the statement P(X(T) < 0) > 0 is true for every T > 0.
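Since X(T) = \alpha + \sigma W(T) and W(T) is Normal(0, T), this probability can be written down exactly. A sketch with hypothetical values of \alpha and \sigma shows it is positive for every T and grows with T:

```python
import math

# For dX = sigma dW with X(0) = alpha, X(T) = alpha + sigma * W(T) and
# W(T) ~ Normal(0, T), so P(X(T) < 0) = Phi(-alpha / (sigma * sqrt(T))).
# alpha and sigma below are illustrative, made-up values.
alpha, sigma = 100.0, 20.0

def prob_negative(T):
    z = -alpha / (sigma * math.sqrt(T))
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))  # standard normal CDF

for T in (1.0, 25.0, 100.0):
    print(T, prob_negative(T))  # grows with T, strictly positive for T > 0
```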

Initially we might think it is not very probable, but the mere fact that it can happen gives us problems. Who is going to trust or use a differential equation that says the probability of the asset having a negative price at some future time is non-zero?

So we go back to our differential equation. Do we start over again? Not really; we just think sensibly. Suppose I have some hot tea that I want to drink when it is just right. So I let it “play” in its environment and then make a judgement on whether I should drink it yet.

Relating this to our problem: an investor or user of an asset will look at the potential gain or loss (the change) dX(t) relative to the current asset price, not in isolation. We care not just about the price itself, but about how the price changes in proportion to where it stands.

So, taking the relative price change to be proportional to the market fluctuations, we have \frac{dX(t)}{X(t)} = \sigma \, dW(t)

Rearranging to dX(t) = \sigma X(t) \, dW(t) and integrating from the initial time to t, this gives us X(t) = \alpha + \sigma \int_0^t X(s) \, dW(s).
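One way to see that this model fixes the negative-price problem is a quick Euler-Maruyama simulation of dX = \sigma X dW (a standard discretisation; the parameters below are made up): each step scales the change by the current price, so the simulated price stays positive.

```python
import math
import random

# Euler-Maruyama sketch for dX = sigma * X * dW: at each step,
# X += sigma * X * dW with dW ~ Normal(0, dt).  Illustrative parameters.
random.seed(1)
alpha, sigma, T, n = 100.0, 0.2, 1.0, 1000
dt = T / n
X = alpha
for _ in range(n):
    dW = random.gauss(0.0, math.sqrt(dt))
    X += sigma * X * dW      # change is proportional to the current price
print(X > 0)  # True: the price never crosses zero in this model
```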

Surely we can solve this, right? Our usual elementary methods make no sense, since W(t), the function modelling market fluctuations, is (in mathematical terms) continuous but nowhere differentiable, so we are stuck.

Unless, that is, we define a new integral equation and a new kind of calculus: the counterpart, for random variables, of the calculus of real numbers. We have stochastic calculus.

Questions:

What happens when the time increments approach zero? (Brownian Motion)

With time increments tending to zero, do we have a differentiable market-fluctuation function? What probability distributions do we use to understand how this process works? (Poisson process)

What new type of integral do we define to solve the strange integral on the right-hand side? (Ito integral)

This new integral must come with its own calculus, given that we will want to differentiate it at some point and derive other properties. What, specifically, is this calculus, and what special defining properties does it have? (Stochastic calculus)
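A numerical sketch of the Ito flavour: approximating \int_0^T W dW with left-endpoint sums converges to (W(T)^2 - T)/2, not the classical W(T)^2/2. The extra -T/2 is exactly the kind of correction the new calculus must account for.

```python
import math
import random

# Ito-integral sketch: approximate \int_0^T W dW by the left-point sum
# sum of W(t_i) * (W(t_{i+1}) - W(t_i)).  Ito calculus gives the answer
# (W(T)^2 - T)/2, not the classical W(T)^2 / 2.
random.seed(2)
T, n = 1.0, 200_000
dt = T / n
W, ito_sum = 0.0, 0.0
for _ in range(n):
    dW = random.gauss(0.0, math.sqrt(dt))
    ito_sum += W * dW        # integrand evaluated at the LEFT endpoint
    W += dW
print(abs(ito_sum - (W * W - T) / 2) < 0.05)  # True: matches Ito's answer
```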

We stop here, since these questions cover enough of the motivation to study stochastic calculus; for the enthusiasts, though, we ask a few more questions and offer some new ideas.

Given a discrete random variable governed by some specific distribution, in between time intervals we can actually use a continuous random variable to “travel” through the time increments. The relevant distributions are the Poisson and the Exponential (and its discrete analogue, the Geometric). How can we use these distributions to understand what is happening to our asset price change, or in our differential equation?
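A sketch of the Poisson/Exponential link (with a made-up rate): stacking Exponential inter-arrival gaps and counting how many arrivals land in [0, t] gives a Poisson(rate * t) count, so its mean should come out near rate * t.

```python
import random

# Poisson process sketch: with Exponential(rate) inter-arrival times,
# the number of arrivals in [0, t] is Poisson(rate * t).  Illustrative rate.
random.seed(3)
rate, t, runs = 3.0, 2.0, 50_000

def arrivals_by(t):
    clock, count = 0.0, 0
    while True:
        clock += random.expovariate(rate)   # wait an Exponential gap
        if clock > t:
            return count
        count += 1

mean_count = sum(arrivals_by(t) for _ in range(runs)) / runs
print(abs(mean_count - rate * t) < 0.1)  # True: mean is close to rate*t = 6
```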

This W(t) function, nowhere differentiable, may change how it behaves as the time increment tends to zero. That may even make solving the DE easier, but there are problems with it. What are these problems? Why do we care about small time increments?

Stochastic calculus requires good knowledge of probability theory, real analysis, linear algebra and a few other bachelor-level modules (complex analysis, differential equations, Hilbert spaces) to fully sink in.