What is the shape of data? This is not a trivial problem, but we can use some tools:

  1. Geometric simplicial complexes: the joining of points, segments, and figures in any dimension.
  2. Homology groups: an algebraic tool that describes holes in a simplicial complex.
  3. The natural pseudo-distance: used to compare data under the action of a group.

Persistence Diagrams

Imagine you have the interval (0,1) and a discretization of it.
You also have a function from this interval to itself.



etc.

The lines between the points are drawn only for the explanation, since the function is discretized.

The question is: what is the shape of the function? How can we represent its shape mathematically?

We take a level and consider the sub-level set, which is the set of points where the function takes a value less than that level.
The idea is to count the number of connected components of the sub-level set as the level increases.

We start from a low level and increase its value until a component is born, that is, until it enters the sub-level set. Each component is described by a pair: its time of birth on the left and its time of death on the right, so we have:



The construction of this list happens like this:

  1. at the starting level we have no components;
  2. as the level rises, a first component joins, so we add its time of birth to the list;
  3. at a higher level a second component joins, and we add its time of birth to the list;
  4. at a higher level still a new component joins the collection, so we write its time of birth in the list;
  5. later, two of these components become contiguous, so by definition the younger one dies, and we add its time of death to its pair in the list;
  6. later still, the first point and the rest of the joined points, represented by the oldest one, become contiguous, so the younger one dies and we add its time of death to the list;
  7. finally, the whole function is contiguous and changing the level will do nothing more, so we add the time of death to the remaining component.

Even if the function were not discretized, or if you amplified it, the collection of points written on the right would not change. These points are the persistence diagram.

The persistence diagram contains only six numbers. You lose a lot of information: you can draw infinitely many functions with the same persistence diagram (an example in the picture is the function in red, which has the same diagram as the one in black). For many tasks, however, the information that remains is sufficient, for example for judging the similarity of shapes.
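The sweep over levels described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample values are hypothetical (they are not the values used in the lecture), the merging is done with a union-find structure, and the component that never dies is given death `math.inf`, a common convention that these notes do not confirm.

```python
import math


def sublevel_persistence(values):
    """Return the (birth, death) pairs of the connected components of the
    sub-level sets of a function sampled on a line of points."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])  # sweep from low to high
    parent = [None] * n   # None means "not yet in the sub-level set"
    birth = {}            # root index -> level at which its component was born
    pairs = []

    def find(i):
        # Follow parent links to the representative of the component.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in order:
        level = values[i]
        parent[i] = i
        birth[i] = level
        for j in (i - 1, i + 1):  # the two neighbours on the line
            if 0 <= j < n and parent[j] is not None:
                ri, rj = find(i), find(j)
                if ri == rj:
                    continue
                # The younger component (larger birth level) dies.
                young, old = (ri, rj) if birth[ri] > birth[rj] else (rj, ri)
                if birth[young] < level:  # skip components that die at birth
                    pairs.append((birth[young], level))
                parent[young] = old

    # Assumed convention: the last surviving component never dies.
    pairs.append((birth[find(order[0])], math.inf))
    return sorted(pairs)


# Hypothetical sample values, chosen only for illustration.
black = [0.5, 0.2, 0.6, 0.1, 0.7, 0.3, 0.8]
red = [0.9, 0.2, 0.6, 0.1, 0.7, 0.3, 0.75]

diagram_black = sublevel_persistence(black)
diagram_red = sublevel_persistence(red)
```

With these sample values both functions give three pairs, so six numbers in total, and the two diagrams coincide even though the functions differ at their endpoints. This is the loss of information mentioned above.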

GENEOs

GENEO stands for Group Equivariant Non-Expansive Operator.
The computation of persistence diagrams is a trivial case of a GENEO.

Blurring is a GENEO with respect to the group of isometries (rotations) of an image: you can apply the two transformations (blurring and a rotation) in either order and the result is the same. So blurring is equivariant with respect to rotations. It is also non-expansive: the distance between two blurred images is never larger than the distance between the originals.
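Both properties can be checked numerically on a small example. The sketch below makes several assumptions that are not in the notes: the blur is a 3x3 mean filter with wrap-around borders, the rotation is by 90 degrees on a square image, the distance is the maximum absolute difference between pixels, and the two images are made up. Exact fractions are used so the equality test is not affected by rounding.

```python
from fractions import Fraction


def rotate90(img):
    """Rotate a square image (a list of rows) by 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def blur(img):
    """3x3 mean filter with wrap-around borders (an assumed, simple blur)."""
    n = len(img)
    return [
        [
            Fraction(
                sum(
                    img[(r + dr) % n][(c + dc) % n]
                    for dr in (-1, 0, 1)
                    for dc in (-1, 0, 1)
                ),
                9,
            )
            for c in range(n)
        ]
        for r in range(n)
    ]


def sup_distance(a, b):
    """Largest absolute pixel difference between two images of equal size."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


# Hypothetical 4x4 images.
img_a = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 1, 2], [3, 4, 5, 6]]
img_b = [[1, 1, 0, 3], [4, 2, 6, 9], [8, 9, 1, 0], [3, 7, 5, 6]]

# Equivariance: blurring then rotating equals rotating then blurring.
equivariant = blur(rotate90(img_a)) == rotate90(blur(img_a))

# Non-expansivity: blurring does not increase the distance between images.
non_expansive = sup_distance(blur(img_a), blur(img_b)) <= sup_distance(img_a, img_b)
```

A check on one pair of images is of course not a proof. It only illustrates what the two conditions say.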

Geometric Simplicial Complexes

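Definition of Affine Combination. The affine combination of the vectors $v_0, \dots, v_k$ with coefficients $\lambda_0, \dots, \lambda_k$:

$$\sum_{i=0}^{k} \lambda_i v_i$$

Under the assumption that

$$\sum_{i=0}^{k} \lambda_i = 1$$

If I have two points $v_0$ and $v_1$, we can say that $\lambda_0 = 1 - \lambda_1$, which means that the affine combination can be written as $(1 - \lambda_1) v_0 + \lambda_1 v_1$, with $\lambda_1 \in \mathbb{R}$.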

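We are considering the real line that joins the two points.

If we do the same with three points $v_0, v_1, v_2$:

$$\lambda_0 v_0 + \lambda_1 v_1 + \lambda_2 v_2$$

With:

$$\lambda_0 + \lambda_1 + \lambda_2 = 1$$

We are considering the plane containing the three points.

The set of all affine combinations of the points $v_0, \dots, v_k$ is the affine hull of $\{v_0, \dots, v_k\}$.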
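As a small illustration of affine combinations, here is a sketch in Python. The function name `affine_combination` and the sample points are made up for this example. The only rule enforced is the one stated above: the coefficients must sum to 1, while negative coefficients are allowed.

```python
from fractions import Fraction


def affine_combination(points, coeffs):
    """Return sum_i coeffs[i] * points[i], refusing coefficients that do
    not sum to 1 (the condition for an affine combination)."""
    coeffs = [Fraction(c) for c in coeffs]
    if len(points) != len(coeffs):
        raise ValueError("need exactly one coefficient per point")
    if sum(coeffs) != 1:
        raise ValueError("coefficients of an affine combination must sum to 1")
    dim = len(points[0])
    return tuple(
        sum(c * p[i] for c, p in zip(coeffs, points)) for i in range(dim)
    )


a, b = (0, 0), (2, 2)

# Coefficients 1/2 and 1/2 give the midpoint of the two points.
midpoint = affine_combination([a, b], [Fraction(1, 2), Fraction(1, 2)])

# Coefficients -1 and 2 still sum to 1: the result is on the line through
# a and b, but outside the segment joining them.
beyond_b = affine_combination([a, b], [-1, 2])

# Three points in general position: any point of their plane is reachable.
in_plane = affine_combination([(0, 0), (1, 0), (0, 1)], [1, 1, -1])
```

Here `beyond_b` shows why the affine hull of two points is the whole line and not only the segment between them.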

We are considering points in general position. But let's consider three points lying on the same line (not in general position).
In the general case the affine hull of three points is a plane, but in this case it is a line (it can also be a single point if all three coincide).

To exclude this case, we assume that our points are affinely independent.

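Definition of Affine Independence. The points $v_0, \dots, v_k$ are affinely independent when the following holds. If

$$\sum_{i=0}^{k} \lambda_i v_i = 0$$

and

$$\sum_{i=0}^{k} \lambda_i = 0$$

Then:

$$\lambda_0 = \lambda_1 = \dots = \lambda_k = 0$$

The points $v_0, \dots, v_k$ are affinely independent if and only if the vectors $v_1 - v_0, \dots, v_k - v_0$ are linearly independent.

In practice, you can just check that the vectors connecting one point to each of the others are linearly independent.

You get the same result if you choose another starting point for the vectors.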
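The criterion with difference vectors is easy to turn into a test. The following is a minimal sketch, with hypothetical function names and sample points: it builds the vectors from one chosen base point to all the others and checks their linear independence by computing a rank with exact Gaussian elimination.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Rank of a matrix (list of rows), by Gaussian elimination on fractions."""
    rows = [[Fraction(x) for x in row] for row in rows]
    found = 0  # number of pivots found so far
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        pivot = next(
            (r for r in range(found, len(rows)) if rows[r][col] != 0), None
        )
        if pivot is None:
            continue
        rows[found], rows[pivot] = rows[pivot], rows[found]
        for r in range(found + 1, len(rows)):
            factor = rows[r][col] / rows[found][col]
            rows[r] = [x - factor * y for x, y in zip(rows[r], rows[found])]
        found += 1
    return found


def affinely_independent(points, base=0):
    """True if the vectors from points[base] to every other point are
    linearly independent."""
    origin = points[base]
    diffs = [
        [x - o for x, o in zip(p, origin)]
        for i, p in enumerate(points)
        if i != base
    ]
    return matrix_rank(diffs) == len(diffs)


triangle = [(0, 0), (1, 0), (0, 1)]   # general position
collinear = [(0, 0), (1, 1), (2, 2)]  # three points on one line

triangle_results = [affinely_independent(triangle, base=b) for b in range(3)]
collinear_results = [affinely_independent(collinear, base=b) for b in range(3)]
```

Looping over `base` illustrates the last remark: the answer does not depend on which point is used as the starting point for the vectors.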

Convex Set

Considering a subset of a vector space, we say it's a convex set if every pair of points in it is connected by a segment contained in the subset.

Definition of Convex Set. A subset $C$ of a vector space is convex if for all $x, y \in C$ and all $t \in [0,1]$ we have that $(1 - t)x + ty \in C$.
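The definition can be probed numerically by sampling points on a segment. This sketch uses two made-up sets in the plane, a unit disk and an annulus, and a hypothetical helper `segment_stays_inside`. Note that sampling can only show that a set is not convex (by finding one segment that leaves it). It can never prove convexity.

```python
def segment_stays_inside(contains, x, y, steps=100):
    """Test the sampled points (1 - t) * x + t * y, for t in [0, 1],
    against the membership function `contains`."""
    for k in range(steps + 1):
        t = k / steps
        point = tuple((1 - t) * xi + t * yi for xi, yi in zip(x, y))
        if not contains(point):
            return False
    return True


def in_disk(p):
    """Unit disk centred at the origin (a convex set)."""
    return p[0] ** 2 + p[1] ** 2 <= 1.0


def in_annulus(p):
    """Unit disk with a hole of radius 0.5 (not a convex set)."""
    return 0.25 <= p[0] ** 2 + p[1] ** 2 <= 1.0


x, y = (-0.8, 0.0), (0.8, 0.0)  # both points belong to both sets

disk_ok = segment_stays_inside(in_disk, x, y)
annulus_ok = segment_stays_inside(in_annulus, x, y)
```

The segment between `x` and `y` passes through the origin, which lies in the hole of the annulus, so the check fails for the annulus and succeeds for the disk.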

Compact Set

In $\mathbb{R}^n$, compact sets are the sets that have the following properties:

  1. They are closed. A set is closed if it contains its boundary. So in $\mathbb{R}$ the interval $[0,1]$ is closed because it contains both its boundary points, $0$ and $1$, while $(0,1)$ is open.
  2. They are bounded. A set $X$ is bounded if there exists a real number $r$ such that the distance between any two points in $X$ is less than $r$. In other words, $X$ can be contained in a sphere of finite radius.

Each compact set $X$ has this property: if $\{U_i\}_{i \in I}$ is an infinite family of open subsets of $\mathbb{R}^n$ such that $X \subseteq \bigcup_{i \in I} U_i$, then a finite subfamily $U_{i_1}, \dots, U_{i_m}$ exists, such that $X \subseteq U_{i_1} \cup \dots \cup U_{i_m}$.

So, basically, if an infinite family of open subsets of $\mathbb{R}^n$ covers $X$, then there exists a finite subfamily of that family that also covers $X$.

Convex combination

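Definition of Convex Combination. A convex combination of $v_0, \dots, v_k$ is an affine combination where all coefficients are non-negative.

For two points $v_0$ and $v_1$, affine combination means

$$\lambda_0 + \lambda_1 = 1$$

To have both coefficients non-negative, it means that

$$0 \le \lambda_0 \le 1, \quad 0 \le \lambda_1 \le 1$$

We are considering just the points between the two points $v_0$ and $v_1$.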

If we consider three points in general position we get a triangle.

If we consider four points in general position we get a tetrahedron.
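For three points in the plane, the coefficients of the combination can be computed explicitly, which gives a simple test for membership in the triangle. The sketch below is an illustration with made-up names and coordinates: it solves for the three coefficients (often called barycentric coordinates) with Cramer's rule, and a point is in the triangle exactly when none of them is negative.

```python
from fractions import Fraction


def barycentric(p, a, b, c):
    """Coefficients (l0, l1, l2) with l0 + l1 + l2 = 1 and
    p = l0 * a + l1 * b + l2 * c, for points in the plane."""
    det = (b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])
    if det == 0:
        raise ValueError("a, b, c are not affinely independent")
    # Solve p - a = l1 * (b - a) + l2 * (c - a) with Cramer's rule.
    l1 = Fraction(
        (p[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (p[1] - a[1]), det
    )
    l2 = Fraction(
        (b[0] - a[0]) * (p[1] - a[1]) - (p[0] - a[0]) * (b[1] - a[1]), det
    )
    return (1 - l1 - l2, l1, l2)


def in_triangle(p, a, b, c):
    """True if p is a convex combination of a, b and c."""
    return all(coeff >= 0 for coeff in barycentric(p, a, b, c))


a, b, c = (0, 0), (4, 0), (0, 4)  # hypothetical triangle, integer coordinates

inside = barycentric((1, 1), a, b, c)    # all coefficients non-negative
outside = barycentric((3, 3), a, b, c)   # still affine, but one is negative
```

Both points are affine combinations of the three vertices, but only the first is a convex combination. The second lies in the plane of the triangle, outside the triangle itself.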

Simplex

Definition of a k-Simplex. A k-Simplex is the convex hull of k+1 points that are affinely independent.

A 0-Simplex is a point, a 1-Simplex is a segment, a 2-Simplex is a triangle, and so on.
The empty set is the unique Simplex in dimension -1.

Simplicial Complex

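Definition of Geometrical Simplicial Complex. A geometrical Simplicial Complex is a collection $K$ of simplexes, verifying these two properties:

  1. if $\sigma \in K$ and $\tau$ is a face of $\sigma$, then $\tau \in K$;
  2. if $\sigma, \sigma' \in K$, then $\sigma \cap \sigma'$ is a face of both $\sigma$ and $\sigma'$.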

Each simplex has many faces: a triangle has three faces of dimension 1 (the segments), three faces of dimension 0 (the points), and the empty set as the face of dimension -1.
The first property requires that if $\sigma$ belongs to $K$ and $\tau$ is a face of $\sigma$, then $\tau$ must belong to $K$. If you put a simplex in the collection, you also have to put in all of its faces.
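The first property can be checked mechanically if each simplex is recorded only by its set of vertices. That is a simplification assumed for this sketch: it ignores the coordinates of the points, so it says nothing about how simplexes intersect geometrically. The function names and vertex labels are made up.

```python
from itertools import combinations


def faces(simplex):
    """All faces of a simplex given by its vertex labels, including the
    empty face (dimension -1) and the simplex itself."""
    verts = sorted(simplex)
    return {
        frozenset(combo)
        for size in range(len(verts) + 1)
        for combo in combinations(verts, size)
    }


def is_closed_under_faces(collection):
    """True if every face of every simplex in the collection is in it."""
    K = {frozenset(s) for s in collection}
    return all(face in K for simplex in K for face in faces(simplex))


def closure(simplices):
    """Smallest collection containing the given simplexes and all faces."""
    return set().union(*(faces(s) for s in simplices))


triangle_faces = faces({"a", "b", "c"})

# Count the faces of the triangle by dimension (dimension = vertices - 1).
faces_by_dimension = {}
for face in triangle_faces:
    dim = len(face) - 1
    faces_by_dimension[dim] = faces_by_dimension.get(dim, 0) + 1

only_a_segment = [{"a", "b"}]  # the endpoints and the empty face are missing
segment_closed = is_closed_under_faces(only_a_segment)
closure_closed = is_closed_under_faces(closure(only_a_segment))
```

For the triangle this reproduces the count given above: three faces of dimension 1, three of dimension 0, and the empty face in dimension -1, plus the triangle itself in dimension 2.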

For $\sigma, \sigma' \in K$, the intersection $\sigma \cap \sigma'$ is both a face of $\sigma$ and a face of $\sigma'$, so