2000 Years of Matrix Multiplication (st-andrews.ac.uk)
201 points by nothrowaways on Feb 4, 2023 | 31 comments


Several methods of solving linear equations in several unknowns are described in Hindu literature, dating from about 200 CE onwards. In particular, Brahmagupta's method seems very similar to Gaussian elimination[1].

And then there are solutions to quadratics, cubics and biquadratics, Pell's equations, etc.

1. https://archive.org/details/history-of-hindu-mathematics-2-b...
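
For context, here is a minimal sketch of the Gaussian elimination procedure the comparison refers to (plain Python, no pivoting or error handling; the function name and example system are mine):

    def solve(a, b):
        """Solve a @ x = b for a small square system given as nested lists."""
        n = len(a)
        # Build the augmented matrix [a | b].
        m = [row[:] + [b[i]] for i, row in enumerate(a)]
        # Forward elimination: zero out the entries below each pivot.
        for col in range(n):
            for row in range(col + 1, n):
                factor = m[row][col] / m[col][col]
                for k in range(col, n + 1):
                    m[row][k] -= factor * m[col][k]
        # Back-substitution, from the last row upward.
        x = [0.0] * n
        for row in range(n - 1, -1, -1):
            tail = sum(m[row][k] * x[k] for k in range(row + 1, n))
            x[row] = (m[row][n] - tail) / m[row][row]
        return x

    # Example: x + y = 3 and 2x - y = 0 give x = 1, y = 2.
    print(solve([[1, 1], [2, -1]], [3, 0]))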


It’s interesting and cool to see how long it took, and how many people it took, for the abstraction of matrices and matrix algebra to develop (in addition to how well-traveled the road was before the people we’ve attributed some of these techniques to and named them after). What this makes me wonder is: what abstractions might be in progress today but incomplete? Maybe it’s a fun thought experiment and there’s no way to know because the ideas haven’t occurred to anyone, or maybe some people already do have an idea but won’t ultimately be remembered for it when our progeny is able to explore it a little more deeply and explain it a little more clearly. Or, I don’t know, maybe the period of history where some things get forgotten is now over?


> What this makes me wonder is: what abstractions might be in progress today but incomplete?

Category Theory, especially how it relates to programming language and API design.

We’re building pyramids like the Egyptians did, and we’re impressed with ourselves and our grand achievements just like they were.

The Egyptians built a wonder of the world before the invention of the wheel!

That’s what programming feels like to me. We’re well paid labourers chiselling away at stone with crude tools. We’ve convinced ourselves that we live in the future because we have copper tools instead of just wood and stone.

The most amazing thing about the computer revolution is that it hasn’t happened yet.


> The Egyptians built a wonder of the world before the invention of the wheel!

The wheel predates the Great Pyramid at Giza by perhaps a thousand years.


AFAIK they didn't use wheels much for the construction of the pyramids; the stones were dragged over earth that they sprinkled with water for lubrication.

Which just reinforces my point.

In the field, I've found that the majority of developers never use a debugger, not even the kind built into modern IDEs.

Even back in the year 2000 that surprised me, but now? It's like watching people shovelling dirt by hand while standing next to a hydraulic digger.


Debuggers aren't really that useful for debugging complex systems with a lot of moving parts, e.g. threaded code, code with actors/agents, opaque references/layers, or edge cases in hot code in particular. They also don't work terribly well if reproductions are timing-sensitive or require a non-local environment.

Time-travel debuggers, which essentially record every stack frame and dump it either continuously or for the last 1-2 ms before some trigger, are quite a bit better, but that's a lot closer to logging everything than to typical debuggers.


> The Egyptians built a wonder of the world before the invention of the wheel

You might find this link interesting

https://journals.uair.arizona.edu/index.php/jaei/article/vie...


If you have not, you may be interested in reading Thomas Kuhn's book -- The Structure of Scientific Revolutions.


Graphs can be abstracted as matrices.

It is often forgotten, but it can make some algorithms faster.
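
As a small sketch of what that buys you (my own example, not from the thread): store the graph as an adjacency matrix, and powers of that matrix count walks between vertices, so path questions become matrix multiplications.

    import numpy as np

    # Toy directed graph on 4 vertices: A[i][j] = 1 if there is an edge i -> j.
    A = np.array([
        [0, 1, 0, 0],
        [0, 0, 1, 1],
        [0, 0, 0, 1],
        [1, 0, 0, 0],
    ])

    # (A^k)[i][j] is the number of walks of length k from i to j.
    walks3 = np.linalg.matrix_power(A, 3)
    print(walks3[0, 3])  # 3-step walks from vertex 0 to vertex 3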


Pretty mind-boggling that there were Babylonians solving simultaneous equations, and that people knew Gaussian elimination in 3rd-century BC China. I wonder if the simultaneous equations were actually used to guide any decisions made by the Babylonian government.


There were Sumerians solving differential equations at least 3700 years ago. Irving Finkel has a wonderful book about this: "The Ark Before Noah".


Difference equations, not differential equations.


Is this the part in the book where they calculate the shape?


Determinants always fascinated me, partly (completely?) because of the cool-sounding name. I always wanted to get intuition for them, because their formula is so simple that it has always been deeply unsatisfying to me to just take it at face value.

This article got me searching for a 3blue1brown video on determinants and now my mind is absolutely blown!

https://youtu.be/Ip3X9LOh2dk


It's unfortunate that more theory makes determinants really intuitive, but most people don't get there. The upshot is that there's a thing called the exterior algebra which is relatively simple to define and calculate with, and essentially encodes the notions of areas, volumes, 4-volumes, etc. over a space. Every linear map on a space uniquely specifies a map on the exterior algebra (in fact this is a functor, which makes calculations easier), and the determinant of a linear map is then just the corresponding exterior algebra map evaluated on the unit n-volume. The minors are the exterior algebra map evaluated on the unit volume for various subspaces.

The rules for the exterior algebra and the fact that this is a functor let you learn a couple simple rules to simplify expressions into a standard form, and then you just apply those rules mechanically and don't have to remember minus signs or what multiplies with what. It becomes a process that requires no thought.
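
In symbols (a standard statement, not specific to any one text): for a linear map f on an n-dimensional space with basis e_1, ..., e_n, the induced map on the top exterior power acts on the unit n-volume by

    \Lambda^n f\,(e_1 \wedge e_2 \wedge \cdots \wedge e_n) = \det(f)\, e_1 \wedge e_2 \wedge \cdots \wedge e_n

and functoriality, \Lambda^n(g \circ f) = \Lambda^n g \circ \Lambda^n f, immediately gives \det(gf) = \det(g)\det(f).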

It's sort of like how once you learn how to deal with complex exponentials, you can forget pretty much every rule from trigonometry, making it entirely pointless to memorize those rules.
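
For example (a standard identity, added only as illustration), the angle-addition formulas fall out of

    e^{i(a+b)} = e^{ia}\, e^{ib}
               = (\cos a + i \sin a)(\cos b + i \sin b)
               = (\cos a \cos b - \sin a \sin b) + i\,(\sin a \cos b + \cos a \sin b)

by matching real and imaginary parts, so there is nothing to memorize.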


It's unfortunate that lower level "math" is so banally taught (i.e. calc and below), at least in the US.


One of the great parts about linear algebra is that there is almost always a simple geometric idea underneath.

The geometric picture underneath is one of the things that keeps me in awe of the subject despite its seeming simplicity, and I keep getting something out of it every time I come back to it. It's a bummer since finite-dimensional linear algebra is one of a handful of mathematics topics where one can answer all the questions posed at the beginning of a course in it by the end of a course in it, so it is a pretty self-contained topic.

After learning e.g. exterior algebra (differential forms), Clifford algebra (geometric algebra in these parts), and so on, the geometric picture of the determinant as the size of an oriented volume makes deriving the algebraic formula super duper slick. In Clifford algebra, for instance, the formula can be proven in two or three lines. It's unfortunate that exterior algebra never seems to get introduced sooner in the pedagogy of linear algebra or multivariable calculus, because when used right it makes the underlying ideas shine through beautifully. It's a bummer, since exterior algebra is much simpler than it looks, though like many things in mathematics, it takes a lot of work to make that simple idea rigorous. But unfortunately, algebra in general, given its abstract nature, can absolutely lobotomize the real-deal geometric ideas underneath a lot of this stuff when used poorly.
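
To make the "two or three lines" concrete, here is the 2x2 case in exterior-algebra notation (my own worked example). Write the columns of the matrix as f(e_1) = a e_1 + c e_2 and f(e_2) = b e_1 + d e_2; then

    f(e_1) \wedge f(e_2) = (a e_1 + c e_2) \wedge (b e_1 + d e_2)
                         = ad\,(e_1 \wedge e_2) + cb\,(e_2 \wedge e_1)
                         = (ad - bc)\,(e_1 \wedge e_2)

using only bilinearity, e_i \wedge e_i = 0, and e_2 \wedge e_1 = -e_1 \wedge e_2. The coefficient ad - bc is exactly the determinant.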


The trifecta of:

  * There's almost always a simple geometric intuition, and low-dimensional intuition can get you quite far even in high dimensional cases.
  * You can surprisingly often get by with closing your eyes and saying "my problem is linear" three times. See: All of neural networks.
  * Linear problems have practically all nice properties you could ever ask of any function.
Has made linear algebra by far the most bang/buck mathematics topic I've studied in my life. Close behind is asymptotic analysis.


A shocking number of people are apparently taught the determinant without learning the geometric interpretation. It's educational malpractice.


Watch his Fourier trick. You're welcome.





> the emergence of the beta rewrite in lambda calculus from the shuffle rewrite and relations to the commutativity of the addition of vectors in the tangent space of a manifold (https://mbuliga.github.io/novo/presentation.html)

piques my interest; can you recommend any presentations of this work that might be suitable to someone without a background in knot theory?


Hi, thanks for the interest. For the pure algebraic part, a good entry point is this (and the references therein): https://arxiv.org/abs/2110.08178

The emergence of the beta rewrite is explained if you try to decorate the graphs from the first figure in the "Emergent rewrites" section according to the dictionary in the first section, and then pass to the limit.

To see why that is a beta rewrite, you have to go to the parent page of Pure See and look for the first article listed there.


I have a stupid question. I still don't really fully understand why matrix multiplication came to be row-by-column. I mean, you could define things as row by row together with the transpose operator, right? So why is row-by-column so obviously the right way to define multiplication?


I'm bad at math, and know some linear algebra only because it's useful in cryptanalysis, but the way I keep this in my head is that matrix multiplication, mechanically, is just repeated application of matrix-vector multiplication; you're just doing multiple columns instead of one. And then, matrix times column vector is linear function application, right?

And you can do column-times-row too, right? It's something like the sum of multiple outer products?

(These question marks are all genuine! I probably have most of this wrong!)
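
Both pictures check out numerically; here is a quick sanity check with numpy (my own example): AB computed column by column is A applied to each column of B, and it is also the sum of outer products of A's columns with B's rows.

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.standard_normal((3, 4))
    B = rng.standard_normal((4, 2))

    # Picture 1: AB is A applied to each column of B (repeated matrix-vector products).
    by_columns = np.column_stack([A @ B[:, j] for j in range(B.shape[1])])

    # Picture 2: AB is the sum of outer products of A's columns with B's rows.
    by_outer = sum(np.outer(A[:, k], B[k, :]) for k in range(A.shape[1]))

    print(np.allclose(A @ B, by_columns))  # True
    print(np.allclose(A @ B, by_outer))    # True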


It's probably related to how we read and write. Top down, left to right. Rows are the entries, columns are attributes. Your list will potentially grow forever, whereas the attributes are typically fixed. Your width is typically fixed, but you can continue writing top down and continue on a new sheet/etc, as you add more rows. It just makes sense with how we typically organize information on pen/paper.


In the row picture, the elements of a matrix are the coefficients of linear functions acting on its domain. In the column picture they are coordinates in its image.

Any matrix A or B can be interpreted from either point of view on its own. When you take their product AB, each of A's functions (row picture) is evaluated on each of B's points (column picture).

This gives an associative (but not commutative) algebra. If you go around the column picture with an operator like A.B=AB^T, you get

    (A.B).C = (AB^T).C
            = AB^TC^T
    

    A.(B.C) = A.(BC^T)
            = A(BC^T)^T
            = ACB^T
The two formulas are not equal, and the second involves "traditional" matrix multiplication. You can compute products like this operationally though, as long as you work from left to right.
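
A quick numpy check of the claim (keeping the same definition A.B = AB^T; the helper name dot_t is mine):

    import numpy as np

    def dot_t(X, Y):
        # The alternative product from above: X.Y = X @ Y^T
        return X @ Y.T

    rng = np.random.default_rng(1)
    A, B, C = (rng.standard_normal((3, 3)) for _ in range(3))

    left = dot_t(dot_t(A, B), C)    # (A.B).C = A B^T C^T
    right = dot_t(A, dot_t(B, C))   # A.(B.C) = A C B^T
    print(np.allclose(left, right))         # False in general: not associative
    print(np.allclose(right, A @ C @ B.T))  # True, matching the derivation above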


Vectors are written as columns. If I have a linear function, I can describe it to you by just telling you what it does to each basis element, so that's what a matrix is: a list of each of the output (column) vectors you get when you apply it to each basis vector as input. Multiplication is actually function composition: if A is the matrix for g and B is the matrix for f, then AB is the matrix for the function `x -> g(f(x))`.
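
A small numpy illustration of "multiplication is composition" (my own example): applying AB to a vector gives the same result as applying B first and then A.

    import numpy as np

    A = np.array([[0.0, -1.0], [1.0, 0.0]])  # matrix for g: rotate 90 degrees
    B = np.array([[2.0, 0.0], [0.0, 3.0]])   # matrix for f: scale the axes by 2 and 3
    x = np.array([1.0, 1.0])

    print((A @ B) @ x)  # the matrix for x -> g(f(x)), applied to x
    print(A @ (B @ x))  # f first, then g: the same vector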


The way things are, you have a nice set of properties which allow you to manipulate systems of matrix equations - but it's just notation and convention. You could do column by row as well - it'd be all the same math, just effectively transposed. If you defined matrix multiplication as row times row, then you'd either be limited to square matrices or you'd lose some of the convenient properties of matrix multiplication:

ABC = A(BC) = (AB)C

(AB)^T = B^T A^T

and most importantly AB =/= BA
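
These properties are easy to spot-check numerically (my own example):

    import numpy as np

    rng = np.random.default_rng(2)
    A, B, C = (rng.standard_normal((3, 3)) for _ in range(3))

    print(np.allclose(A @ (B @ C), (A @ B) @ C))  # associativity: True
    print(np.allclose((A @ B).T, B.T @ A.T))      # transpose reverses the order: True
    print(np.allclose(A @ B, B @ A))              # commutativity: False in general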


It's a dark pattern to make matrix math seem more complicated than it is.


They didn’t talk about matrix exponentiation, which appears all over system and control theory and quantum mechanics. I wrote them about this just now.
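
Assuming the comment means the matrix exponential e^{At} that shows up in linear systems theory, here is a minimal sketch using scipy.linalg.expm (the example system is my own choice):

    import numpy as np
    from scipy.linalg import expm

    # Linear system x'(t) = A x(t); its solution is x(t) = expm(A * t) @ x(0).
    # A below generates a rotation (a harmonic oscillator), chosen only as an example.
    A = np.array([[0.0, 1.0],
                  [-1.0, 0.0]])
    x0 = np.array([1.0, 0.0])

    t = np.pi / 2
    print(expm(A * t) @ x0)  # approximately [0, -1]: a quarter turn of the initial state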



