Book Review: “ChatGPT and the Future of AI”

I spotted “ChatGPT and the Future of AI” by Terrence Sejnowski in Posman Books of Pittsburgh on a particularly hot summer day and read it standing up in the air conditioning for about 10 minutes before eventually deciding to bring it home with me. AI discourse has been flooding the enterprise tech sphere for the past couple of years, and the technology is now being adopted (or, rather, adoption is being attempted) by all kinds of businesses, some with more success than others. I wanted to actually understand the underpinnings and history of where this all came from, and this book seemed like exactly what I was looking for. My thoughts below:

Summary

⭐⭐⭐ / 5

A solid technical survey with identity issues

The stated goal of this book is to “give the reader a perspective on what is happening in AI behind the scenes,” and it cuts right to the chase, opening with a basic introduction to AI, machine learning, and LLMs. ChatGPT itself makes its appearance early on; Sejnowski uses it both to illustrate the technology in action and to contribute as a “co-author,” which I’ll come back to in a minute.

The book really does succeed at explaining how AI and ML work, as opposed to a lot of the marketing material and rabid LinkedIn screeds out there that make everything seem a bit too much like magic. Sejnowski gets down to brass tacks on how the models are created from the data, how there needs to be a LOT of data, and then how there’s so much data that the model performs best when instructed how to weight and present that data to a particular user. These models communicate in natural language, which means they can be widely leveraged. In short, this is different from what’s come before.

The middle of the book has a section that goes deep into transformers, the underlying machinery that powers LLMs. I came to this book with enough background in math and science to understand the broad strokes, but unless you’re really up to speed on high-dimensional geometry and topology, not all of this section will be fully understandable. This is mostly fine; you don’t need to understand the mathematics in depth to get the gist of what’s being described.

However, readers approaching this from a mostly non-technical background might struggle here, and this leads into an area where I think the book falters: identifying and writing for a specific audience. I think this is a direct consequence of Sejnowski relying heavily on AI itself when creating this book, to the point where he credits ChatGPT as a co-author. Excerpts from conversations with ChatGPT feature heavily: they illustrate the technology in action, summarize a chapter, create a list of clarifying questions, and do multiple other jobs within the context of the book. These excerpts are clearly denoted, so there’s no subterfuge, but they contain a lot of padding and would instantly be recognized as AI-generated even if they weren’t specially formatted. Anyone who has conversed with ChatGPT for more than a few minutes will recognize the language: lots of restatements and looping summaries, all technically correct but not especially precise, which muddy the tone of the book. Sometimes it feels like anything that might be relevant is looped in because the AI is trying to be helpful.

Sejnowski is very optimistic about AI, sometimes to the point of attributing more novelty to the technology than I think is warranted. For instance, he marvels at AI’s ability to display empathy in text, but once you understand that LLMs are prediction engines that can be trained on empathetic language, it isn’t especially mysterious. Knowing which words come next in an emotionally charged situation is precisely how a model trained on natural human language should behave (or try to behave); it’s not evidence of emergent human-like inner states so much as a different type of text prediction and mimicry. This tendency to enthusiastically overinterpret model behavior occasionally undermines Sejnowski’s otherwise strong scientific framing.

Since the book is unclear about its intended audience, I’ll attempt to define it:

  • Readers with some familiarity with at least mid-level undergraduate mathematics (multivariable calculus and above)
  • People who know the basics of AI but want a single-source technical survey
  • Anyone comfortable reading some chapters like an engaging essay and other chapters like a semi-formal textbook
  • Technical professionals (engineers, PMs, data people) who want conceptual clarity about what’s happening under the hood and how it developed

3 out of 5 on Goodreads.

Anne Guzzi

My work centers on keeping large amounts of data secure, reliable, understandable, and compliant. This includes leading and supporting migrations from legacy resources, managing data ops, products and programs, and building and maintaining strong reference and master data foundations.