Friday, October 10, 2025

On the Impact of AI/LLMs in Mathematics and the Role of Embeddings

For a view on LLMs from the perspective of mathematical research, watch Alex Kontorovich's recent talk at Harvard CMSA, "The Shape of Math to Come" [2].

If you want to skip the beginning, I suggest jumping to minute 33 [1]. He does, however, refer back to earlier slides when describing the speedup he is after when it comes to using LLMs to generate new mathematical results.

Tl;dr: He envisions a future of math hinging on two components: first, formalization in a language like Lean or similar (Agda, Coq, ...), and second, an AI component on top of the former. Here he only has a generative one in mind, namely an LLM (compare this to LeCun's DINO talk [3]). At minute 33 he starts conveying his main point: what matters is the idea we want to express, not its formalization, except that we need a system or process ensuring the latter actually conveys the former correctly. Even if an LLM system like AlphaProof could generate new mathematical results (papers) that are fully correct and true in 99% of cases, it would not be useful for advancing mathematics, precisely because of the probabilistic nature of that correctness. But LLMs could help in the initial step of translating ideas into Lean and then finding a proof which, by the typed nature of Lean, is correct if it "compiles". Therein could lie the real speedup in mathematical research: researchers would be able to spend more time shaping their ideas, instead of dedicating significant effort to proofreading their own derivations.
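To make the "if it compiles, it is correct" point concrete, here is a minimal Lean 4 sketch (my own toy example, not from the talk; the theorem name is arbitrary):

```lean
-- A trivially formalized statement: addition on naturals is commutative.
-- If this file type-checks, the proof is correct by construction;
-- no human proofreading of the derivation is needed.
theorem my_add_comm (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b

-- In Kontorovich's vision, an LLM's job would be to produce proof terms
-- like `Nat.add_comm a b` from an informal idea; Lean's kernel then
-- accepts or rejects them with certainty, not with 99% probability.
```

The division of labor matters: the LLM may be wrong as often as it likes, because the type checker, not the model, is the arbiter of correctness.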

I wonder whether one could draw a direct analogy between the role of that formalization step in Lean and the underlying universal representation, i.e., in AI parlance, the embedding step. In the following text I will use embedding, representation, description and, to some extent, formalization interchangeably.

Here, formalization is meant as a graded concept: the same math result expressed in Lean would be 100% formalized, while in natural language it would sit anywhere between 0% and, say, 1%; the point being that it is upper bounded.

In this sense, I would say that describing some objects as vectors using arrays of integers is in itself a level of formalization, and at the same time an embedding. But you could instead describe your vectors in a synthetic way, as geometrical objects (the arrows). Or you might be able to dispense with the idea of vectors altogether and describe your problem in a purely diagrammatic way. Mind you, all these descriptions belong to well-formalized areas of mathematics (linear algebra, synthetic geometry, or say Clifford algebra/geometric algebra), and (many) diagrammatic descriptions can be placed within category theory.

Each description/representation of your world will have different computational advantages. Hence, using different formal math tools (say, from different math fields) could hugely improve the results or usefulness of any given ML/AI tool. Maybe each of these AI tools has a different representation (description, embedding, formalization) under which its use is optimal.
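A toy illustration of that point, a minimal sketch of my own using only the standard library: the same 2D rotation expressed in two representations, as a matrix acting on a coordinate array and as multiplication by a unit complex number. Both describe the same geometric fact, but the second collapses the computation into a single multiplication.

```python
import cmath
import math

def rotate_matrix(v, theta):
    """Array/matrix representation: v is a coordinate pair (x, y)."""
    x, y = v
    c, s = math.cos(theta), math.sin(theta)
    return (c * x - s * y, s * x + c * y)

def rotate_complex(v, theta):
    """Complex-number representation: a rotation is one multiplication."""
    z = complex(*v) * cmath.exp(1j * theta)
    return (z.real, z.imag)

v, theta = (1.0, 0.0), math.pi / 2
a = rotate_matrix(v, theta)
b = rotate_complex(v, theta)
# Different descriptions, same answer: (1, 0) rotated by 90 degrees.
assert all(math.isclose(x, y, abs_tol=1e-12) for x, y in zip(a, b))
```

Neither representation is "the right one"; which is preferable depends on what you want to compute next, which is exactly the point about choosing descriptions.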

So, while the area of Reinforcement Learning (RL), say, is currently stuck, that could change if we find a better description suited to it. The same applies to a practical problem: you may be stuck trying to solve it with some ML/DL tool, and changing your description of the world may lead you efficiently to its solution.

The latter is in fact a "universal" trick, not only in math or physics: change the way you look at things, and they can go from being a roadblock to boosting you forward.


Anyway, it's quite possible that all this is but incoherent rambling under the influence of a little bug I'm currently evicting.


Note: In Kontorovich's video, image and sound are far out of sync; sound and subtitles, however, are in sync.

[1] minute 33: https://youtu.be/xIG1RI44Nog?t=2002

[2] full talk: https://www.youtube.com/watch?v=xIG1RI44Nog

[3] LeCun's talk on DINOv3: https://www.youtube.com/watch?v=yUmDRxV0krg


PS: Talking about different descriptions, YouTube just fed me another recent video, "Towards a Geometric Theory of Deep Learning" by Govind Menon. Maybe worth a closer look another time.


PS2: The Internet gods pointed me to another video that would make a perfect second part to Kontorovich's: "Where is mathematics going" (September 24, 2025) by Kevin Buzzard, sponsored by the Simons Foundation.

Monday, October 6, 2025

On "The Bitter Lesson" by Richard Sutton

Just found an interview with Richard Sutton referencing his "Bitter Lesson" post from 2019. See link at the bottom.

He claims to be making two points. I think his first one is a useful lens for deciding which AI topic might be more fruitful in the long run. Here, "long run" is meant at the personal level, not over the long run of science. Read on.


The successful approaches would be based fundamentally on just "search" and "learning". These are what are often criticized as "simple brute-force approaches".


This approach is the one that has led to the current widespread adoption and surprising effectiveness of RL (exemplified by systems like AlphaGo and AlphaZero), deep learning, and LLMs.


And this success should be contrasted with the comparatively failed attempts to implement techniques that encode our own understanding of how we solve the same tasks or how we learn.


In summary, techniques built on our prior world models fared far worse than techniques based on simple, general brute-force methods.


As he mentions in that interview, these claims are meant merely as a reflection of what has and has not worked over the past 70 or so years. He makes no claim that this will remain the case in the future.


His second claim is contained in the very last paragraph of his post. However, I'm not certain his point there isn't just a rephrasing of the previous one.


Neither am I convinced he is consistent: one might argue that what he thinks of as "search & learning" are but idealizations, i.e., models of methods created by our minds. In particular, I think that what he considers "learning" is simply RL, reinforcement learning.


But this method is itself human knowledge that we would be putting into the AI, which is the very approach he seems to be arguing against in this second point.


His only argument here seems to be that he considers these two processes simple enough to serve as guiding principles. They do fit with the view of his first point, though.


References

https://en.wikipedia.org/wiki/Richard_S._Sutton 

http://www.incompleteideas.net/IncIdeas/BitterLesson.html