As we head toward the end of the calendar year, a few items:
- Google published a new result in Nature a few days ago. This made a big news splash, including this accompanying press piece from Google themselves, this nice article in Quanta, and the always thoughtful blog post by Scott Aaronson. The short version: Physical qubits as made today in the superconducting platform favored by Google don't have the low error rates that you'd really like if you want to run general quantum algorithms, which could easily require millions of steps. The hope of the community is to get around this using quantum error correction, where some number of physical qubits function together as one "logical" qubit. If the physical qubit error rates are sufficiently low, and those errors can be corrected with enough efficacy, the logical qubit can perform better than the physical qubits, ideally being able to undergo sequential operations indefinitely without degradation of its information. One technique for this is called the surface code. Google have implemented this in their most recent 105-physical-qubit chip ("Willow"), and they seem to have crossed a huge threshold: when they increase the size of the correction scheme (going from a 3 (physical qubit) \(\times\) 3 (physical qubit) patch to 5 \(\times\) 5 to 7 \(\times\) 7), the error rate of the resulting logical qubit falls as hoped (a toy sketch of this scaling appears after this list). This is a big deal, as it implies that larger chips, if they could be implemented, should scale toward the desired performance. This does not mean that general-purpose quantum computers are just around the corner, but it's very encouraging. There are many severe engineering challenges still in place. For example, the present superconducting qubits must be individually tweaked and tuned. The reason Google only has 105 of them on the Willow chip is not that they can't fit more - it's that they have to have the wiring and control capacity to tune and run each one. A few thousand really good logical qubits would be needed to break RSA encryption, and there is no practical way to put millions of wires down a dilution refrigerator; rather, one will need cryogenic control electronics.
- On a closely related point, Google's article talks about how it would take a classical computer ten septillion years to do what its Willow chip can do. This is based on a very particular choice of problem (as I mentioned here five years ago) called random circuit sampling: applying random gate sequences to the quantum processor and looking at the statistical properties of the output bitstrings (a second toy sketch after this list shows what the benchmark measures). From what I can tell, this is very different from what most people mean when they think of a problem to benchmark a quantum computer's advantage over a classical computer. I suspect the typical tech-literate person considering quantum computing wants to know: if I ask a quantum computer and a classical computer to factor huge numbers or solve some optimization problem, how much faster is the quantum computer for a given size of problem? Random circuit sampling feels to me much more like comparing an experiment to a classical theory calculation. For a purely classical analog, consider putting an airfoil in a wind tunnel and measuring turbulent flow, then comparing with a computational fluid dynamics calculation. Yes, the wind tunnel can get you an answer very quickly, but it's not "doing" a calculation, from my perspective. This doesn't mean random circuit sampling is a poor benchmark, just that people should understand it's rather different from the kind of quantum/classical comparison they may envision.
- On one unrelated note: Thanks to a timely inquiry from a reader, I have now added a search bar to the top of the blog. (Just in time to capture the final decline of science blogging?)
- On a second unrelated note: I'd be curious to hear from my academic readers on how they are approaching generative AI, both on the instructional side (e.g., should we abandon traditional assignments and take-home exams? How do we check to see if students are really learning vs. becoming dependent on tools that have dubious reliability?) and on the research side (e.g., what level of generative AI tool use is acceptable in paper or proposal writing? What aspects of these tools are proving genuinely useful to PIs? To students? Clearly generative AI's ability to help with coding is very nice indeed!)
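As promised in the first bullet, here is a toy sketch of why crossing the error-correction threshold matters. This is not Google's analysis; it just evaluates the standard rough below-threshold scaling for the surface code, \(\epsilon_L \approx A \, (p/p_{th})^{(d+1)/2}\), where \(p\) is the physical error rate, \(p_{th}\) is the threshold, \(d\) is the code distance, and \(A\) is a prefactor. All the numerical values below are made up purely for illustration.

    # Toy illustration only - not Google's analysis. Below threshold, the
    # surface code's logical error rate per cycle scales roughly as
    #     eps_L ~ A * (p / p_th) ** ((d + 1) / 2)
    # with physical error rate p, threshold p_th, and code distance d.
    # All numbers here are made up for illustration.

    A = 0.1        # assumed prefactor (illustrative)
    p_th = 1e-2    # assumed threshold physical error rate (illustrative)

    def logical_error_rate(p, d):
        """Rough below-threshold scaling of the surface-code logical error rate."""
        return A * (p / p_th) ** ((d + 1) / 2)

    for p in (5e-3, 2e-3):          # two hypothetical physical error rates
        rates = {d: logical_error_rate(p, d) for d in (3, 5, 7)}
        lam = rates[3] / rates[5]   # suppression factor per step in distance
        print(f"p = {p:.0e}: " +
              ", ".join(f"d={d}: {r:.1e}" for d, r in rates.items()) +
              f"  (suppression ~ {lam:.1f}x per step in d)")

The point is simply that once \(p\) is below \(p_{th}\), each step up in code distance (3 to 5 to 7) buys a roughly constant multiplicative suppression of the logical error rate, and that suppression factor improves as the physical qubits get better.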
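And as promised in the second bullet, here is a deliberately tiny, purely classical sketch of what random circuit sampling actually measures. With only a handful of qubits the full statevector fits trivially in memory, so we can compute the ideal output distribution, draw samples the way a perfect device would, and evaluate the linear cross-entropy benchmarking fidelity \(F_{XEB} = 2^n \langle p_{ideal}(x) \rangle - 1\), which is near 1 for a faithful sampler and near 0 for uniform guessing. The gate set, depth, and shot count here are arbitrary choices for illustration, not what is used on Willow.

    import numpy as np

    rng = np.random.default_rng(0)
    n, depth, shots = 5, 12, 4000      # tiny circuit, easy to simulate classically
    dim = 2 ** n

    def haar_random_1q():
        # Haar-random 2x2 unitary via QR of a complex Gaussian matrix
        z = (rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))) / np.sqrt(2)
        q, r = np.linalg.qr(z)
        return q * (np.diag(r) / np.abs(np.diag(r)))

    def apply_1q(state, u, target):
        # Apply a single-qubit gate to one tensor index of the n-qubit state
        psi = np.tensordot(u, state.reshape([2] * n), axes=([1], [target]))
        return np.moveaxis(psi, 0, target).reshape(dim)

    def apply_cz(state, q1, q2):
        # Controlled-Z: flip the sign of amplitudes where both qubits are 1
        psi = state.reshape([2] * n).copy()
        idx = [slice(None)] * n
        idx[q1], idx[q2] = 1, 1
        psi[tuple(idx)] *= -1
        return psi.reshape(dim)

    # Build and simulate a random circuit: layers of random 1q gates plus CZs
    state = np.zeros(dim, dtype=complex)
    state[0] = 1.0
    for layer in range(depth):
        for q in range(n):
            state = apply_1q(state, haar_random_1q(), q)
        for q in range(layer % 2, n - 1, 2):
            state = apply_cz(state, q, q + 1)

    probs = np.abs(state) ** 2
    probs /= probs.sum()               # guard against floating-point drift

    # A perfect "device" samples bitstrings from the ideal distribution;
    # a hopeless one guesses uniformly at random.
    ideal_samples = rng.choice(dim, size=shots, p=probs)
    uniform_samples = rng.integers(dim, size=shots)

    # Linear cross-entropy benchmark: F_XEB = 2^n * <p_ideal(sample)> - 1
    f_ideal = dim * probs[ideal_samples].mean() - 1
    f_uniform = dim * probs[uniform_samples].mean() - 1
    print(f"F_XEB, ideal sampler:    {f_ideal:.2f}  (near 1)")
    print(f"F_XEB, uniform guessing: {f_uniform:.2f}  (near 0)")

A real device sits somewhere between those two extremes, and the advantage claim rests on the belief that computing \(p_{ideal}(x)\) for circuits on dozens of qubits at useful depth is astronomically expensive classically, while the hardware just produces the samples.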
5 comments:
On the use of GenAI tools in research:
Adoption has come quicker than everyone thought it would, yet I think it is just not being acknowledged openly. These seem like great tools for increasing one's productivity once you've figured out the limits of each one. Tools like Perplexity, paper-qa and GitHub Copilot have improved my ability to get things done.
I agree completely. I think it is a bit frustrating that AI tools like Perplexity, Cursor and ChatGPT are seen as the enemy of serious science, when their output is indistinguishable from that of human researchers whose papers routinely contain mistakes (hallucinations), misquote references, and sometimes indulge in data manipulation, fabrication and concealment. It is like insisting on slide rules because using calculators would blunt our superior intellect. Galactica, paper-qa and others have made it clear that factories of incremental research masquerading as serious science can be automated, for whatever that is worth. If we really want LLMs not to come up with random ideas and pretend they are true, we should hold ourselves to the same standards and write papers with the humility we seem to demand of LLMs, so that LLMs and graduate students trained on these papers imbibe the ability to put error bars on their creative ideas (aka hallucinations) and the tools to fact-check them.
Anon@11:59, can you say a little more about how you use Perplexity or paper-qa in ways that you find helpful? I've heard of Perplexity, but I don't know what distinguishes it from the other LLM-based systems, and I've never played with paper-qa. Can tools like that handle math (like, if I wanted to glean different expressions for some quantity from the literature), or are they entirely text-based?
Anon@4:56, I think it all comes down to what people are able to learn and do. Knowing how to read the literature and learn from it has historically been a valuable skill; how much of that is retained or should be retained vs. being replaced with new skills (e.g. how to craft prompts to get the most out of LLM-based tools) is an open question.
I'm a basic user of AI tools, but things like ChatGPT have their place. It's definitely helped with quality control of my writing: I can run drafts through it and have it point out issues I sometimes bump into with phrasing, overly long sentences, etc. I'd say it's definitely a huge time saver here, as I spend significantly less time reading and re-reading what I wrote to pick up on errors.
It also works as a first check on methods, and to see if something is truly new, etc., although here it does have some issues and I need to be careful.
I've also used it to help with code for some more complicated data analysis and simulations. It's never perfect, but for someone without formal training in Python, it can definitely get me close enough to my end goal that I can work out the rest. For coding tasks, it does cut the time in half.
For my research, GitHub Copilot has been an amazing boost. It's not perfect, but it's a wonderful assistant. It helped me fix a bug in a large open source project written in C++ (where my understanding is mediocre) by providing the syntax I needed to evaluate the problem; given what the required code looked like, I don't think I'd have ever figured it out on my own, even with StackExchange, etc.
I don't use ChatGPT or others very often, though I did have a nice conversation with ChatGPT about how various quantum mechanics textbooks cover different topics in the course I'm designing.
On the teaching front, I have found that GitHub Copilot can do most of the basic computational physics assignments that I give nearly on the first try; some require a small amount of reprompting. As of when I last checked (in August), it still failed at more complicated questions, but I'm sure that's only a matter of time. So right now, complicated questions still evaluate/encourage/require student understanding. The issue is that I cannot *only* give students hard/intricate/complicated questions; they need to establish their skills on more basic/standard questions, and the AI tools excel at answering standard questions that many people have answered before. This poses a dilemma: how do I make sure the students actually do their own work on the foundational/basic problems?
I used to have weekly computer labs where the students would submit their assignments a few days after the lab. Because of AI, this year I made those lab assignments due immediately at the end of the lab (where I know they don't have access to the AI tools). This change forced me to shorten the lab assignments and decreased the quality of the students' work, since they didn't really have time to edit. I also assign a small number of more sophisticated problems, which they work on for a couple of weeks at a time; for those, AI tools are allowed to be used (though they must be credited). I'm not really sure how well this system has worked, but that's where I am at this point.
The final exam used to be a 72-hour take-home on which they wrote code. Because of these AI tools, I moved it to an in-person, 3-hour scheduled exam, but that meant I couldn't make them actually code anything. I was not pleased with the exam that I wrote, and I didn't find the students' performance to be indicative of what I wanted them to learn from the course. Perhaps I could do better, but I don't think I will do that again. But I am not sure what I *will* do.