Matthew Schwartz of Harvard has made a big recent splash, between his public Aspen talk "10000 Einsteins" a year ago about the role of AI and the future of physics, his talk last week at the APS Global Physics Summit on the same topic, and now this essay, "Vibe Physics: The AI Grad Student", on the website of Anthropic (producers of the AI tool Claude).
The essay talks about how Prof. Schwartz used Claude to write this paper, and he states that the AI tool functions roughly like a 2nd year grad student (one who also doesn't get tired or complain, but does need close checking and supervision). The claim is that with this approach to doing calculations and writing papers, he was able to come out with a piece of work that would've taken literally ten times longer if done by working with a human student. Note that he's not exactly unbiased, and he concludes his essay (on Anthropic's site) by saying you should spend the $20/month Claude subscription fee and it will change your life.
There is no doubt that AI tools can speed up certain kinds of work, and there is every hope that applying this in science will lead to an increased pace of progress. That said, right now these tools are (unsurprisingly) best at working in areas that are well-known and explored - one of my colleagues has tried applying them to really underexplored higher-dimensional problems, and they're much less effective there. The essay's claim that "LLMs are profoundly creative" is provocative. There is also no discussion here of the cost of these tools, in financial, energy, and environmental terms.
Still, Schwartz raises many questions about the future of the field and graduate education in general. (His paragraph about how human beings will still be needed in science for getting experimental data, at least for a while, is really something.) University research is not just about answering scholarly questions; it's about educating people. Maybe some faculty will revel in writing papers without that kind of interaction, but somehow I don't think we're quite at the stage yet where we don't need to worry anymore about training experts in technical fields. I do agree that it's good advice for everyone to pay close attention to where these capabilities are going. We certainly live in interesting times.
11 comments:
I saw Schwartz's talk. He emphasized the exponential increase in the brainpower of artificial intelligence in recent years, growing at a rate something like 10,000,000 times that of biological intelligence. But it seems to me that the improvement will only continue until AI trainers can no longer outsmart the AI, at which point the AI would plateau. Maybe finding a new factorization theorem is something where AI can interpolate from its current training data and succeed through brute-force application of its knowledge, but there are plenty of other problems in physics where you need to think creatively because the most straightforward approach won't work.
1. He had to interact with the AI an absurd number of times and correct its hallucinations and nonsense constantly. He also uncritically links to Steve Hsu's AI slop paper (which is fatally flawed; see https://www.math.columbia.edu/~woit/wordpress/?p=15362)
2. How many problems did he try it on before he found one that worked?
3. The AI that "solved" research-level math problems likely had them in its training data (https://decrypt.co/302691/did-openai-cheat-big-math-test). Perhaps something similar happened here too.
When will the AI bubble burst? The sooner the better. Maybe the oil shock will do it.
Schwartz is correct that theoretical physics has mainly advanced through solvable toy models. What he leaves out is the fact that most breakthroughs in the 20th century involved either zero- or 1-dimensional models. BCS theory is effectively zero-dimensional, as is random matrix theory. Due to holomorphic factorization, 2D CFT is effectively 1D for many purposes. Bethe ansatz-solvable integrable models can be properly 1+1-D, but computing most observables (correlation functions) is extremely painful and is largely bypassed by 1D DMRG.
For many systems in higher dimensions, though, the toy models, if they exist, likely need 2 or more effective dimensions. Basic computational complexity limits mean that LLMs or any other classical computing approach cannot straightforwardly solve most complex problems. Whether there are nontrivial, "integrable" structures relevant to physics in more than 1+1-D that are also somewhat tractable is not at all clear, and also not amenable to brute-force search.
I have not seen or read the source materials, but I will comment on a few points mentioned here that seem to come from Matthew Schwartz.
- The "2nd year grad student" is assumed to be a fixed entity, but it is not. A second-year grad student from 100 years ago cannot tackle some research problems that a second-year grad student now may find ordinary. Why? Because we have better tools now (e.g., spectroscopy machines, computers, better telescopes, etc.). I am surprised that a Professor from Harvard missed this obvious point.
- It seems to me that his worldview is that the primary responsibility of a Professor is to treat students like paper machines, instead of teaching/mentoring them to be better scientists/teachers. To be honest, I am not that surprised by this, even though we all should be, in an ideal world.
I do suggest reading the essay - it won't take long and is quite interesting. I shouldn't put words in his mouth, but it seems like the not-directly-stated endpoint envisioned is that we won't need to train anyone much longer; sure, now we need experts to double-check the work of AI tools, but soon those will be so good that we won't need to do that. We won't need anyone to learn about these things in the manner of traditional education, and if people want to learn about theoretical physics, they can presumably be taught by interacting with AI tools. Thus learning theoretical physics becomes a specialist hobby rather than an active human-driven pursuit. I'm not quite ready to buy this any time soon.
From the Anthropic blog post: "Research often begins in the second year. G2 students start with well-defined projects that have a guarantee of success."
So that's what I'm doing wrong...
Can you elaborate on the basic computational complexity limits? Seems interesting.
@ Anon 4:10, It's simply the fact that many interesting/hard problems share the feature that they scale exponentially in the number of constituents (electrons, atoms, etc.).
E.g., to calculate the ground state of an isotropic fractional Chern insulator at 1/3 filling, one puts particles on a lattice of 3n x 3n sites. The number of particles is 1/3 of the 9n^2 sites, i.e. 3n^2. In a fixed momentum sector (using conservation laws to simplify the problem), the size of the matrix that needs to be diagonalized grows as (27/4)^{3n^2}.
So, using this estimate:
a) n = 1 (3 particles in 9 states): the matrix size is ~300 x 300
b) n = 2 (12 particles in 36 states): the matrix size is ~10^9 x 10^9
c) n = 3 (27 particles in 81 states): the matrix size is ~10^22 x 10^22
a) is easy. b) is awful--if we need all eigenstates of the dense matrix, we need a memory of order 10^10 gigabytes. c) is impossible on any conceivable classical hardware.
From my perspective, what I find most useful about LLMs is that they make it much easier to “come up to speed” in a different field of research. When a graduate student starts in an area with a rich body of scholarship, say topological quantum condensed matter or chromatin polymer dynamics or microbial eco evolution or clinical radiobiology or nuclear medicine imaging and theranostic oncology (I’ve started in all of these entry points at various points in my career), no one expects them to be able to make a cutting edge contribution right away.
As students of course we are young and naive and ambitious. But usually what happens is that students quickly come to realize that their initial ideas for projects are quite inadequate due to, among other things, someone else having already thought about that and done it, or it hitting a wall that the student couldn’t have anticipated a priori. Then they try again and again and eventually over time, they make enough mistakes, and deep enough mistakes, that they reach a point where they are asking questions that really are meaningful frontiers. But just getting to the point where one can appreciate what the meaningful questions can take several years at least, usually, depending on the field.
This is why, as we get older and more senior, scientists appreciate how hard it is to work across disciplines. We learn that while it's one thing to listen to a seminar outside your area and ask a good question or have a nuanced thought, it's another thing entirely to convert such half-baked intuitions into concrete and meaningful research directions. And in today's academia, before you get tenure, no one can afford to 'waste time' - papers, grants, jobs and committees will not wait for or tolerate it.
I am somewhat of an exception in that I, for better or worse, have jumped all over the place as I mentioned above- partly due to my own choice and partly due to circumstances outside of my control. Somehow I still landed a tenure track job, though I don’t know if I would recommend my trajectory as the way to get there. But I’m now finally starting to reach a level of depth and breadth in knowledge where I can make concrete, solid, meaningful connections and contributions across scales, disciplines, etc. But it took me over a decade without AI.
However, now that I have AI and LLMs I have been able to see how asking the right questions of it could have sped up this process significantly. I would have been able to identify the substantial, non superficial questions that yield high quality content and interdisciplinary connections, with far fewer growing pains and barriers to entry, that really let me get to the “good stuff” not too long after I had a half baked idea or ambition from a seminar I heard.
As another example of how AI can help with connecting the dots, I have been following this blog since 2009. I must have made at least 75-100 comments/questions/out loud thoughts, many anonymously and some by my name, on various science and sociology issues that were discussed in posts. The other day I went through and tried to collect as many as I could find, and shared them with Claude. The AI helped me see how seemingly disparate thoughts I had relate to each other and how the chronology of comments reflects my intellectual and personal growth and evolution.
If "these tools are (unsurprisingly) best at working in areas that are well-known and explored", then undergraduate physics education is going to change very quickly.
Undergraduate education (and probably education in general) is going to be almost unrecognizable within a few decades, I predict.