NitpickLawyer 2 days ago

Having the ability to throw math-heavy ML papers at the assistants and get simplified explanations / pseudocode back is absolutely amazing, as someone who's forgotten most of what I learned at uni, 25+ years back, and never really used it since.

  • tyre a day ago

    This is where LLMs shine for learning, imo: throwing a paper into Claude, getting an overview, and then being able to ask questions.

    Especially for fields that I didn’t study at the Bachelor's or Master's level, like biology. Getting to engage with deeper research alongside a knowledgeable tutor-assistant has enabled me to go deeper than I otherwise could.

    • fastasucan 15 hours ago

      How do you know it's correct? And how do you learn to engage with a theory-heavy subject by doing it this way?

      • NuclearPM 4 hours ago

        Ask for sources. Easy.

    • devin 12 hours ago

      If you did not study these topics, the chances are good that you do not know what questions to even ask, let alone how to ask them. Add to that the fact that you don't even know whether the original summary is accurate.

      • tyre 2 hours ago

        The original summary is the paper’s abstract, which I read. The questions I ask are what I don’t understand or am curious about. Chances are 100% that I know what these are!

        I’m not trying to master these subjects for any practical purpose. It’s curiosity and learning.

        It’s not the same as taking a class; not worse either. It’s a different type of learning for specific situations.

      • kurthr 11 hours ago

        Asking the right questions (in the right language) was important before, and it's even more important with LLMs if you want to get any real leverage out of them.

    • paulryanrogers a day ago

      Isn't there a risk that you're engaging with an inaccurate summarization? At some point inaccurate information is worse than no information.

      Perhaps in low-stakes situations it could at least guarantee some entertainment value. Though I worry that folks will get into high-stakes situations without the tools to distinguish facts from smoothly worded slop.

      • augment_me 21 hours ago

        There is, but there is an equal risk if you were to engage about any topic with any teacher you know. Everyone has a bias, and as long as you don't base your worldview and decisions entirely on one output, you will be fine.

        • SetTheorist 12 hours ago

          Experimenting with LLMs, I've had examples like it offering the Cantor set (a totally disconnected topological space) as an example of a continuum, immediately after providing the (correct) definition of a continuum as a non-empty, compact, connected (Hausdorff) topological space. This is immediately obvious as nonsense if you understand the topic, but if one were attempting to learn from it, it could be very confusing and misleading. No human teacher would do this.
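
          For readers without the topology background, here is the clash spelled out:

            $C = \bigcap_{n \ge 0} C_n$, where $C_0 = [0,1]$ and $C_{n+1}$ deletes the open middle third of every interval of $C_n$. This $C$ is non-empty and compact, but between any two of its points lies a deleted interval, so its connected components are singletons: $C$ is totally disconnected, hence not connected, hence not a continuum.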

          • tyre 2 hours ago

            I don’t know what any of this means!

            But I’m not trying to become an expert in these subjects. If I were, this isn’t the tool I’d use in isolation (which I don’t use for these cases anyway).

            Part of reading, questioning, interpreting, and thinking about these things is (a) defining concepts I don’t understand and (b) digging into the levels beneath what I might understand.

            It doesn’t have to be 100% correct to understand the shape and implications of a given study. And I don’t leave any of these interactions thinking, “ah, now I am an expert!”

            Even if it were perfectly correct, neither my memory nor understanding is. That’s fine. If I continue to engage with the topic, I’ll make connections and notice inconsistencies. Or I won’t! Which is also fine. It’s right enough to be net (incredibly) useful compared to what I had before.

        • fastasucan 15 hours ago

          >but there is an equal risk if you were to engage about any topic with any teacher you know.

          No, it isn't.

          • HDThoreaun 12 hours ago

            I’ve used LLMs to summarize hundreds of papers. They’ve been more accurate than any teacher I’ve known. Summarizing text is one of their best skills.

        • thoroughburro 15 hours ago

          It’s my experience that humans are far, far, far more trustworthy about their limitations than LLMs. Obviously, this varies by human.

        • tipperjones 11 hours ago

          It’s only equal if you consider two outcomes: some risk and no risk.

          And there’s always some risk.

        • cma 17 hours ago

          Are you just saying that broadly, e.g. that the original 2022 ChatGPT was also an equal risk if used this way?

          You won't be able to verify everything taught from first principles, so at some point you do have to give different sources different credibility, I think.

      • paladin314159 a day ago

        I've been doing this a fair amount recently, and the way I manage it is: first, give the LLM the PDF and ask it to summarize + provide high-level reading points. Then read the paper with that context to verify details, and while doing so, ask the LLM follow-up questions (very helpful for topics I'm less familiar with). Typically, everything is either directly in the original paper or verifiable on the internet, so if something feels off, I'll dig into it. Over the course of ~20 papers, I've run into one or two erroneous statements made by the LLM.
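
        Concretely, the loop looks something like this Python sketch (the OpenAI SDK, the model name, the prompts, and a pre-extracted paper.txt are all illustrative stand-ins for whatever you actually use):

          from openai import OpenAI

          client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment
          paper_text = open("paper.txt").read()  # PDF already converted to plain text

          def ask_llm(prompt: str) -> str:
              # One self-contained call; the full paper rides along as context.
              resp = client.chat.completions.create(
                  model="gpt-4o",
                  messages=[
                      {"role": "system", "content": "You are a careful paper-reading assistant."},
                      {"role": "user", "content": f"{prompt}\n\n---\n{paper_text}"},
                  ],
              )
              return resp.choices[0].message.content

          # Step 1: summary + reading points, to be verified against the paper itself.
          print(ask_llm("Summarize this paper and list high-level points to watch for while reading."))

          # Step 2: follow-up questions while reading; anything that feels off gets
          # checked in the original paper or on the wider internet.
          while (question := input("follow-up> ")):
              print(ask_llm(question))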

        To your point, it would be easy to accidentally accept things as true (especially the more subjective "why" things), but the hit rate is good enough that I'm still getting tons of value through this approach. With respect to mistakes, it's honestly not that different from learning something wrong from a friend or a teacher, which, frankly, happens all the time. So it pretty much comes down to the individual person's skepticism and desire for deep understanding, which usually will reveal such falsehoods.

      • tovej a day ago

        Yes. I usually test AI assistants by giving them my own work to summarize, and have nearly always found errors in their interpretation of the work.

        The texts have to be short and high-level for the assistants to have any chance of accurately explaining them.

        • slow_typist 21 hours ago

          I can probably process anything short and high-level by myself in a reasonable time, and if I can't, I will know, while the LLM will always simulate perfect understanding.

      • anon291 a day ago

        There is, but just ask it to cite the foundational material. A huge issue with reading papers on topics you don't know about is that you lack the prerequisite knowledge, and without a professor in that field, it may be difficult to really build it. ChatGPT is a huge productivity boost. Just ask it to cite references and read those.

    • fragmede a day ago

      I'm not sure of the exact dollar value of feeling safe enough to ask really stupid questions that I should already know the answer to, and that I'd be embarrassed to have anyone see me ask Claude, but it's more than I'm paying them. Maybe that's the enshittification play: an extra $20/month if you don't want it to sound judgey about your shit.

  • sesm 16 hours ago

    How do you verify that the explanation is accurate? Mathematical definitions can be very subtle.

    • trauco 15 hours ago

      The answer is you get the top mathematician in the world to do it, easy peasy.

      “The argument used some p-adic algebraic number theory which was overkill for this problem. I then spent about half an hour converting the proof by hand into a more elementary proof, which I presented on the site.”

      What’s the exchange rate for 30 minutes of Tao’s brain time in regular researcher’s time? 90 days? A year?

      • gjm11 7 hours ago

        For that sort of task: no, Tao isn't all that much better than a "regular researcher" at relatively easy work. But the tougher the problems you set them, the more of an advantage Tao will have.

        ... But mathematics gets very specialized, and if it's a problem in a field the other guy is familiar with and Tao isn't, they'll outperform Tao unless it's a tough enough problem that Tao takes the time to learn a new field for it, in which case maybe he'll win after all through sheer brainpower.

        Yes, Tao is very very smart, but it's not like he's 100x better at everything than every other mathematician.

  • codemac a day ago

    Math notation is high-context, so it's great to just ask LLMs to print out a low-context version in something like Lisp, where I can read and decompose it quickly.
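
    For instance, a definition like softmax(z)_i = exp(z_i) / sum_j exp(z_j) unpacks into code where every symbol is spelled out; a sketch in Python (Lisp would serve the same purpose):

      import math

      def softmax(z: list[float]) -> list[float]:
          # Shift by the max for numerical stability; the shift cancels in the ratio.
          m = max(z)
          exps = [math.exp(zi - m) for zi in z]
          total = sum(exps)
          return [e / total for e in exps]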

adidoit a day ago

I hope we continue to see gains for scientific professionals and companies doing research.

Even imperfect assistants increase leverage.

kregasaurusrex 2 days ago

'Vibe formalizing' is a logical extension of 'vibe engineering' implemented by 'vibe coding'. Sometimes I have trouble getting the individual puzzle pieces of a problem to fall into place; a hypothetical 'Move 37 as a Service' that unifies informal methods with mathematical rigor deserves to be explored!

  • anon291 a day ago

    I had put some polyhedral compilation papers on the back burner. I had read all the material, but some key questions still made it impossible for me to implement. In particular, I was looking at Barvinok's counting algorithm and did not understand why you needed to expand the polynomials in a pointed cone. However, ChatGPT correctly led me through the reasoning. Could it have made a mistake? Of course. And it did. However, since my confusion meant that I was also wrong, bouncing the idea back and forth was really useful. Plus, the AI bots are better at understanding your own particular points of confusion.

gerdesj a day ago

Erdős

I was told by a Hungarian that Hungarian written spelling and spoken pronunciation are pretty precisely aligned, compared to, say, English. Except when it comes to names, when it gets a bit random!

Why not do the bloke the decency of spelling his name correctly? Those diacritics are important.

Anyway, I was told that Paul's name is very roughly pronounced by an anglophone as: "airdish".

  • mjcohen a day ago

    (I saw this on a math department bulletin board about 1960)

    A theorem both deep and profound
    States that every circle is round.
    But in a paper by Erdős
    Written in Kurdish
    A counterexample is found.

  • perching_aix 10 hours ago

    Ő is just œ (oe), nothing crazy. Certainly not a scenario that would belong to the quirky category.

    The only weird ones I can think of are the ones that end in -y. For example, Görgey. They're meant to be -i endings. They signify a noble lineage (or at least used to).

    I guess "ch" might also show up every now and then too (it's just "cs", just like "ch" in English). For example, Széchényi.

    Since this is a compsci forum to some extent, maybe I should also mention that the so-called Lanczos interpolation is "actually" Lánczos. Took even me a while to pick up on that one! Thinking about it, I now see that it features a "cz", another letter (digraph) that is no longer part of the alphabet.

    Also note that Paul is a "translated" name. His actual name was Pál Erdős. He got lucky with that one, it's an easy swap. Edward Teller (Ede Teller) was the same way, and so was John (von) Neumann (János Neumann).

    As a bonus trivia, the Hungarian name order is big endian, like the Japanese. So it would be "Erdős Pál", "Teller Ede", "Neumann János", and "Lánczos Kornél". Though just like with Japanese, I would not recommend trying to adhere to this order in most English speaking contexts.

  • zajio1am 12 hours ago

    Diacritics are language-dependent, so carrying Hungarian diacritics over into English text makes no sense.

  • somenameforme 20 hours ago

    I take it that that's a palatalized ending? I read your comment at first and was like "airdish" wtf? Then I palatalized the 'os' ending and realized oh yeah... that does sound kind of like airdish!

  • yeasku a day ago

    He's not American, so nobody here cares...

  • renewiltord a day ago

    Irrelevant. Cf. Diogenes on death

    • perching_aix 9 hours ago

      Sounds like nihilism for beginners.

  • umanwizard a day ago

    I could be wrong, but FWIW I doubt Hungarians include diacritics that don’t exist in Hungarian (like ñ) when writing foreign names.

    • gf000 20 hours ago

      Depends. There are names that are "romanized" to Hungarian pronunciation rules, like Dosztojevszkij (Dostoevsky) or Kolumbusz Kristóf (Cristoforo Colombo; Hungarian puts the family name first), though that is no longer the practice; it's mostly used for historic names only. That is, Trump is written like that, and not as we would pronounce it (something like "Trámp").

      In general, if the source language has a Latin alphabet, we try to stick to the original spelling in most cases, but it is not uncommon to replace non-Hungarian letters with the closest one. It's a bit more complicated in the case of non-Latin alphabets, especially Cyrillic, due to a lot of shared history.

      • umanwizard 14 hours ago

        How would you write e.g. the Spanish surname Yáñez ?

        • gf000 14 hours ago

          Unless it's a famous person who lived several centuries ago, I would probably leave it as Yáñez, or, if I (or more realistically, an average, not too technical user) had no easy way to input the special characters, write it as Yanez. Probably not as Yánez: even though we do have the letter 'á', leaving that in may be more misleading in terms of pronunciation than the non-accented version.

          • umanwizard 3 hours ago

            I think that's roughly similar to how it's done in American English (though Americans are probably less likely to have any idea how to type accented letters). A serious publication would write Yáñez but an average person would be pretty likely to just type "Yanez". Anyway, my point was just that it's not too weird to see someone write "Erdos" if their native language doesn't have the ő character.

    • hgal a day ago

      You are wrong.

RossBencina 2 days ago

Also interesting that the responses include anti-Lean material.

  • orochimaaru a day ago

    I'm not a mathematician, but how credible is that anti-Lean material? Are they marketing an alternative programmatic approach, as in they're anti-Lean because "I got something else", or are they philosophically anti-Lean with valid arguments?

    • dwohnitmok a day ago

      It's mainly the latter, although the author makes half-hearted gestures at some sort of CAS (Computer Algebra System) being better.

      It's not very credible. There are individual fragments that make sense but it's not consistent when taken together.

      For example, by making reference to Gödelian problems and his overall mistrust of infinitary structures, he's implicitly endorsing ultrafinitism (not just finitism, because e.g. PRA, which is the usual theory for finitary proofs, also falls prey to Gödel's incompleteness theorems). But this is inconsistent with his expressed support for CASes, which very happily manipulate structures that are meant to be infinitary.

      He tries to justify this on the grounds that CASes only perform a finite number of symbol manipulations to arrive at an answer, but the same is true of Lean, otherwise typechecking would never terminate. Indeed, this is true of any formal system you could run on a computer.

      Leaving aside his inconsistent set of arguments for CAS over Lean (and there isn't really a strong distinction between the two, honestly; you could argue that Lean and other dependently typed proof assistants are just another kind of CAS), his implicit support of ultrafinitism would already require a huge amount of work to make applicable to a computer system. There isn't a consensus on the logical foundations of ultrafinitism yet, so building out a proof checker that satisfies ultrafinitistic demands isn't even really well-defined and requires a huge amount of theory crafting.

      And just for clarity, finitism is the notion that unboundedness is okay but actual infinities are suspect. E.g. it's okay to say "there are an infinite number of natural numbers" which is understood to be shorthand for "there is no bound on natural numbers" but it's not okay to treat the infinitary object N of all natural numbers as a real thing. So e.g. some finitists are okay with PA over PRA.

      On the other hand ultrafinitists deny unboundedness and say that sufficiently large natural numbers simply do not exist (most commonly the operationalization of this is that the question of whether a number exists or not is a matter of computation that scales with the size of the number, if the computation has not completed we cannot have confidence the number exists, and hence sufficiently large numbers for which the relevant computations have not been completed do not exist). This means e.g. quantification or statements of the form "for all natural numbers..." are very dangerous and there's not a complete consensus yet on the correct formalization of this from an ultrafinitistic point of view (or whether such statements would ever be considered coherent).
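
      To make the flashpoint concrete, here is a minimal Lean sketch (my own toy example): the theorem below quantifies over all naturals, which a finitist reads as harmless shorthand about arbitrary finite instances, and which an ultrafinitist may reject as meaningless.

        theorem no_largest_nat : ∀ n : Nat, ∃ m : Nat, m > n :=
          fun n => ⟨n + 1, Nat.lt_succ_self n⟩  -- n < n + 1 for every n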

      • xtal_freq 19 hours ago

        I wonder what the ultrafinitist argument against theorems about the natural numbers as defined in Coq would be.
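
        For reference, the definition at issue is just the inductive type of zero and successor. Coq writes it roughly as Inductive nat := O | S (n : nat); a Lean rendering (renamed here to avoid the builtin) is:

          inductive MyNat where
            | zero : MyNat
            | succ (n : MyNat) : MyNat
          -- Nothing infinitary appears in the definition itself; the ultrafinitist
          -- objection is to treating the totality of its values as a completed whole.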

        • dwohnitmok 12 hours ago

          They would say the theorems are meaningless.

          The classical mathematician would respond that the theorems are clearly meaningful and you can easily test them against any natural numbers you care about to see empirically they are meaningful.

          The ultrafinitist would respond that they are only coincidentally correct, in the same way that pre-modern mathematical reasoning was often very sloppy, featured regular abuse of notation, and had no coherent foundations, but nonetheless still often arrived at correct conclusions by "coincidence."

          The classical mathematician might then go over how strong the intuition of something like "there exists a number that..." is and how it is an easily empirically validated statement...

          And so the debate would keep going.

    • lakecresva a day ago

      It seems pretty obviously machine-generated, and the appearance of the word "metaphysics" in the output suggests the prompt author didn't know what they were talking about to begin with.

      • aleph_minus_one 15 hours ago

        > the appearance of the word "metaphysics" in the output suggests the prompt author didn't know what they were talking about to begin with.

        Or the author is not natively fluent in English.

  • CamperBob2 2 days ago

    Due to his position and general fame, Tao has to deal with a larger-than-usual number of kooks.

    • aswegs8 16 hours ago

      Even on Mathstodon, where there is on average 1 reply and 1 like per comment.

      • pretzellogician 11 hours ago

        Totally true! But also, users like Tao don't feed the trolls there, and everyone has the option to block them.

  • testartr a day ago

    [flagged]

    • DroneBetter a day ago

      I think Zeilberger is taken heavily out of context and confused with Norman Wildberger a lot. He certainly has some eccentric opinions, but that one is not at all reflected in his blog's contents (which are largely things like "[particular paper] presents [conjecture/proof] that can be [resolved/shortened] by routine methods", where the methods are only routine because of his decades of work). It's a shame that his being the go-to example of a crank seems to have become ingrained in LLMs.

WhyOhWhyQ a day ago

I've had mixed results with AI on research mathematics. I've gotten it to auto-complete non-trivial arguments, and I've found some domains where it seems hopelessly lost. I think we're still at a point in history where mathematicians will not be replaced by AI and can only benefit by dabbling with it.

  • godelski a day ago

    I've had similar results in both mathematics and programming. For one paper I was writing, I wanted to try them on a fairly straightforward problem: counting the number of permutations. I spent much more time trying to get the AI to figure it out than it took to actually solve it, and I couldn't get it to do it even after I had solved the problem. Similarly, in coding I've had it fail to find trivial bugs, like an undefined keyword that would have easily been caught had I turned on ctags (the keyword was inherited from a class, which made it so easy to miss). But I've also had them succeed on non-trivial tasks, and those have been of great help, though nothing has ever been end-to-end just the AI.

    So I agree. I think these tools are very helpful, but I also think it is unhelpful that people are trying to sell them as much more than they are. The overhype not only legitimizes any pushback but also hands it ammunition. I believe the truth is that they're helpful tools, but still far from replacing expertise, and that if we try to oversell them we run the risk of ruining their utility. If you promise the moon, you have to deliver the moon. That's different from aiming for the moon and landing in the trees.

fsniper 16 hours ago

I still can't believe that we are, in my lifetime, in the 'Star Trek'-like era of "Computer, plot me a proof for this math problem"! Wish we could also do the same for "Beam me up, Scotty".

  • niek_pas 15 hours ago

    > Wish we could also do the same for "Beam me up scotty"

    You might die every time you do, though, so maybe not.

    • bluedel 15 hours ago

      For some definitions of "you" and "die".

    • fsniper 15 hours ago

      That also raises the philosophical question: will I be the same if all my atoms and molecules are copied exactly?

      • aleph_minus_one 15 hours ago

        Thinking about such questions before we are capable of doing such an experiment, at least with small animals, is like discussing how many angels can stand on the point of a pin.

        • MaxBarraclough 9 hours ago

          Inventing a Star Trek-style teleporter would be quite something, but I don't see how it would advance the philosophy in any way. We already know the teleportation subjects would report 'feeling just the same' as before. If they didn't then by definition it's not a functioning teleporter, as it accidentally modified the subject in transit.

        • fsniper 14 hours ago

          I am not sure philosophy concerns itself with whether we can do it yet or not.

anon291 a day ago

I was driving around tonight doing errands and, while I was doing so, had a great conversation with ChatGPT about the intimate details of the LLVM and GCC pipeline schedulers. It's a huge productivity boost. It has now taken notes for me for some compiler stuff I'm experimenting with. This would previously have been impossible.

  • egl2020 a day ago

    Based on my experience, I would expect that the LLM was wrong about some of those details. Of course, your mileage (see what I did there?) may vary.

    • anon291 7 hours ago

      Of course it was. But no more wrong than I'd have been had I started looking into it without its help. On the other hand, reading code while driving is terribly dangerous, while hands-free ChatGPT is easy. Moreover, I prefer talking for some things.

bgwalter 2 days ago

[flagged]

  • lanstin a day ago

    Because he is smart enough to use the existing (frontier) tools to get good results and create a sort of collaborative environment that is novel for research maths.

    • lanstin a day ago

      As for collaboration, I meant: https://terrytao.wordpress.com/2024/09/25/a-pilot-project-in... The issue with horizontal scaling of maths research is trust: if you don't know the author, it is more work to verify their work, especially non-formal proofs. Lean4 enables large projects to be split up into pieces where Lean can validate each intermediate result, so a much broader group of people can contribute pieces without jeopardizing the overall soundness.
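
      A minimal Lean sketch of that workflow (hypothetical lemma names, my own illustration): a coordinator states the intermediate pieces, contributors prove them independently, and the kernel checks each one, so the final assembly requires no trust in anyone's prose.

        -- Each piece can come from a different contributor; Lean verifies each one.
        theorem step1 (n : Nat) : n + 0 = n := Nat.add_zero n
        theorem step2 (n : Nat) : 0 + n = n := Nat.zero_add n

        -- The assembly depends only on the statements above, already checked by the kernel.
        theorem combined (n : Nat) : (n + 0) + (0 + n) = n + n := by
          rw [step1, step2]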

    • sd9 a day ago

      Indeed. Who’s to say whether Wiles would have used AI assistance if it had, you know, existed, in 1994.

      • CamperBob2 a day ago

        Wiles's initial presentation of the proof had a serious flaw that killed the whole thing until he found a workaround. I don't remember how long it took him to get out of the jam, but I'm sure he would have handed his credit card to the Devil himself if he thought it might help.

        People who don't take advantage of the best available tools and techniques don't get to that level to begin with.

    • bgwalter a day ago

      Collaborative environment meaning that any PFY employed by the "AI" providers can read your most intimate thought processes and keep track of embarrassing failures or misconceptions.

      • perching_aix a day ago

        The embarrassing failures or misconceptions of math experts with regards to research level mathematics? Definitely a serious problem.

        Though by your "Perelman and Wiles didn't need "AI" assistance" comment, you'd surely be there on the sidelines to ridicule them for each and every single one. I guess maybe that's where your concerns are coming from?

        I can practically see how these concerns of yours would suddenly evaporate if they started using self-hosted models instead... ... yeah, right, who are we kidding?

        • TomatoCo a day ago

          If mathematicians aren't occasionally saying something that's obviously wrong to a 300-level student, are they really pushing the envelope? I'm just a programmer, but I find that my, and my coworkers', biggest insights always come right after we've said something seriously dumb.

          • sandspar 21 hours ago

            "Progress by crashing into things then putting the broken pieces back together" is surprisingly effective.

  • perching_aix 2 days ago

    Thankfully it's mathematics, so people powerscaling their idols, deifying them to the detriment of others, and putting terms into quotes mockingly is not what determines whether results hold or not. Perhaps the only field not fundamentally shackled by this type of quackery, even if people try their hardest from time to time to make it so.

    • bgwalter a day ago

      [flagged]

      • perching_aix a day ago

        It's fine; at least you admit that what you wrote was just to insult.

        For people who at least pretend to care not to think in straw men: it's been nearly six years, and their deus has never exited said machina (if it's ever been in there to begin with, or anywhere else).