Hey there, I’m Mala. I’ve been in the data game for a good 25 years, first as a DBA and then as a Data Engineer, mainly working with Microsoft stuff.
I’m a bit of a podcast junkie, especially when it comes to AI topics. I’m into learning more about the tech behind AI, the alarmist stuff (and there is a lot of it), ethical debates, security concerns, and how AI is shaking up jobs and the economy.
I do my best to find trustworthy sources to learn from, but you know how it is – sometimes it’s tough to tell what’s legit. So, if you ever see me post something that seems a bit off, please cut me some slack. These aren’t necessarily my opinions, just things that caught my eye.
What I learn is just my take on what I heard or read. It might not always jibe with what the original speaker or writer meant or intended. I don’t use any fancy AI bots like ChatGPT to help me out; I just quote stuff and break it down in my own words.
This newsletter is my way of sharing what I’ve learned with a wider crew. Hope you find it interesting!
Understanding what it is
I am a big fan of simplicity. Fully understanding what an LLM (Large Language Model) does and how it does it can get pretty complex and detailed, so it helps to start with the simple basics, as I plan to do here in the first few newsletters.
What if you wanted to explain how ChatGPT works to a non-tech person – say an elderly person obsessed with it, or a teenager? Or even if you, like me, struggle with articulating what you know about it?
My good friend Buck Woody’s podcast interview can be of great help in this regard. In this interview, aptly named ‘The Blooming of AI Spring’, Buck chats with Patrick LeBlanc and explains in simple terms the history of AI: starting with the Greek myth of Talos, one of the earliest conceptions of a robot, on to the philosopher René Descartes’ theories on how the mind relates to the body, and then the test developed by Alan Turing, famously called the ‘Turing Test’. Alan Turing, the father of modern computing, pondered the problem of whether a computer could answer a jury’s questions in such a way that it convinces them it is really a person.
“You know it has arrived when you quit talking about it” ..Buck Woody
Buck quotes an article on four types of intelligence that could rule the planet (the article is based on Nick Bostrom’s book Superintelligence: Paths, Dangers, Strategies). The article quotes the book and talks extensively about the dangers of not knowing the potential of what we create as software; the key example cited is social media. To me, it comes across as somewhat sensationalist and exaggerated in its pessimism, but it does raise valid concerns about the responsibility of both creators and users to understand the potential consequences of their creations.
This talk by Prof Stuart Russell, titled ‘Beyond ChatGPT: Stuart Russell on the risks and rewards of AI’, has some interesting perspectives on the definition and ethical challenges of AI. Prof Russell is a computer science professor at UC Berkeley and a pioneering researcher in artificial intelligence.
‘A simple definition of what it does is finding patterns in all the training data that somehow resemble the current sequence of words that it’s looking at and then sort of averaging those patterns and using that to predict the next word.’..Prof Stuart Russell
He starts by talking about n-gram language models, which are basically word prediction tools. A ‘bigram’ model looks at pairs of words: given the previous word, it guesses the most likely next one. A ‘trigram’ model does the same using the previous two words, so if you say ‘I am,’ it can guess the next word might be ‘happy’ or ‘sad.’
Big language models like GPT-4 are like supercharged versions of this. Prof. Russell says ChatGPT-4 is like a 32,000-word version of these prediction tools. It’s trained on a massive amount of text, like all the books ever written combined. That’s why it can come up with lots of different ways to say things, but it’s also why it’s tricky to make it act the way we want it to.
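To make the n-gram idea concrete, here is a tiny Python sketch of my own (the training text and names are made up by me, not taken from the talk). It counts which word tends to follow each pair of words in a scrap of text and then ‘predicts’ the most common one:

```python
from collections import Counter, defaultdict

# Toy trigram-style predictor: for every pair of consecutive words, count
# which word followed them, then predict the most frequent follower.
training_text = (
    "i am happy today . i am sad today . i am happy again . "
    "you are happy . you are kind ."
)

counts = defaultdict(Counter)
words = training_text.split()
for w1, w2, w3 in zip(words, words[1:], words[2:]):
    counts[(w1, w2)][w3] += 1

def predict_next(w1, w2):
    """Return the word most often seen right after the pair (w1, w2)."""
    following = counts[(w1, w2)]
    return following.most_common(1)[0][0] if following else None

print(predict_next("i", "am"))  # -> 'happy' (it followed "i am" more often than 'sad')
```

An LLM is doing something enormously more sophisticated than counting word pairs, but the basic job – predict the next word from the words that came before it – is the same.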
Below are a few more questions I had that his talk answered for me.
Why do people think it is ‘intelligent’?
Using first-person pronouns is a design choice that its creators made, to make it seem personable to its users. It’s also like watching a movie with computer-generated water – your brain sees it as real water. The same goes for text. You read it and think it’s smart, but it is mostly just parroting back information it learned from its training.
How can they keep it from giving bad advice?
Think of AI like ChatGPT a bit the way people thought about dogs a long time ago. They realized they could train and use dogs for certain things, like herding animals or protecting camps from danger. Dogs were great companions, but they couldn’t do everything humans could, like writing emails or doing homework.
Now, with ChatGPT-4, it’s kind of similar. Sometimes it behaves in ways we don’t want, like giving advice on harmful things, even though it’s not supposed to. It is trained not to ‘misbehave’ – not to offer medical advice, for example – by folks who ‘reinforce’ against the bad responses with “no, don’t do that” over and over, hoping the repeated reinforcement will help. It is not a foolproof method, but it is largely effective if the model is trained sufficiently.
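To picture that feedback loop, here is a deliberately over-simplified Python sketch, entirely my own toy and nothing like the real training pipeline (which adjusts the model’s internal parameters rather than a little table of answers). It only illustrates the idea that answers which keep getting a ‘no, don’t do that’ become less likely to be picked again:

```python
import random

# Two hypothetical canned answers start out equally likely; repeated
# reviewer feedback nudges their weights up or down.
candidates = {
    "Here is some general wellness information...": 1.0,
    "Take this specific medication for your symptoms.": 1.0,  # the 'bad' answer
}

def pick_answer():
    answers, weights = zip(*candidates.items())
    return random.choices(answers, weights=weights, k=1)[0]

def give_feedback(answer, approved):
    # Reward approved answers a little, penalize disapproved ones a lot.
    candidates[answer] *= 1.2 if approved else 0.2

# Simulate many rounds of review; the medical-advice answer keeps getting rejected.
for _ in range(20):
    answer = pick_answer()
    give_feedback(answer, approved="medication" not in answer)

print(max(candidates, key=candidates.get))
# -> the general-information answer ends up with by far the higher weight
```

Even in this cartoon version, the penalized answer never drops to exactly zero probability, which mirrors the point above: largely effective, but not foolproof.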
‘We don’t have the faintest idea’
I found a couple of things in what the professor said that were a bit confusing and alarmist to me. A huge amount of alarmist misinformation exists out there, which is partly why I started this newsletter.
Starting at around 17:00, he says:
“if you’re training a circuit to be extremely good at imitating human linguistic behavior, in fact, the default assumption would be that it ought to form internal goal structures and the appropriate additional processing circuitry if you like, that would cause those goal structures to have a causal role in generating the text. So it’s actually a natural hypothesis that GPT four does have its own internal goals.”
The professor also quotes a certain unnamed Microsoft employee at 19:04
“So, in fact, when I asked one of the Microsoft experts who did a months-long evaluation of GPT-4 on whether it has internal goals and is using them to guide the generation of text. The answer was we haven’t the faintest idea.”
As a tech professional, I find it extremely scary to make something that has ‘its own internal goals’. It is also scary that its makers readily and freely admit they don’t have ‘any clue’ what those goals are.
So I went to Kevin Feasel(b) – data scientist, data platform MVP, and my co-lead at TriPASS – to get some clarification.
Kevin’s take
“Given my most positive interpretation, “internal goals” are the consequences of having an optimization model trained on some set of data and aiming for the lowest-cost (most likely to be accurate) answer. “
In my mind, this paralleled the SQL query optimizer settling for the best plan in the shortest time possible. I get it, and I think most of us will. We don’t get how the optimizer settled on a certain plan or why, and many times it is just plain wrong. But the only ‘goal’ it has is to give you the best plan possible in the shortest time, just like the only ‘goal’ the model has is to provide an answer to the user’s question.
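Here is a toy Python sketch of that ‘most positive interpretation’ – the tokens and scores are made up by me, but it shows what ‘aiming for the lowest-cost answer’ means in this reading:

```python
# Hypothetical scores for the next word after the prompt "I am ..."
candidate_next_tokens = {
    "happy": 0.7,
    "sad": 0.2,
    "refrigerator": 0.000001,
}

def lowest_cost_choice(scores):
    # "Cost" here is just negative likelihood, so minimizing cost means
    # picking the most likely token. There is no hidden agenda in this
    # function - only the optimization target we wrote down.
    return min(scores, key=lambda token: -scores[token])

print(lowest_cost_choice(candidate_next_tokens))  # -> 'happy'
```

Like the query optimizer, whatever it picks, the only goal in play is the one its builders wrote down.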
Kevin goes on:
“Given my most negative interpretation, “internal goals” is an anthropomorphization of the model’s operation and inferring from results that the model actually thinks and has a purpose of its own. If he means the latter, he’s wrong. If he means the former, then the goal is trite: come up with tokens that are likely to fit together given the stream of prior tokens.”
It remains unclear whether the Professor intended ‘internal goals’ to refer to the model minimizing its cost, akin to an optimizer, or to the model possessing an independent purpose born of its intricate complexity. Personally, I hope the former interpretation is accurate, as I found the Professor’s talk quite insightful and educational. Nevertheless, if the Professor meant the latter – equating large language models with a ‘Big AI’ capable of human-like thinking and harboring undisclosed agendas – that aligns with sensationalist AI discourse and should be critically examined, as Kevin rightly pointed out.
“When you train on social media, you get psychosis because social media magnifies psychosis.”..Kevin Feasel
Kevin continues: “As for the Microsoft expert, this is the functional equivalent of me asking, ‘Say, do you believe that lizard people are actually running the governments of all major countries?’
‘Oh, we haven’t the faintest idea.’
It’s the ‘I want to believe’ poster of the strong AI crowd: they have zero evidence of actual thinking machines but WANT them to exist, and so anything that happens that looks like it could be the product of human-like thought – oh, we can’t rule that out!”
Sane thoughts – thanks, Kevin! That’s it for this issue; see you all in two weeks with another take on AI and its analysis. Thanks for reading.