14 November 2024

When LLMs go bad

Note: Originally I was writing this up as a journal entry, but decided this would be something worthwhile to share.

Large Language Models (LLMs) like ChatGPT are all the rage these days. Some people fear them, some people love them, some people don’t know what to think of them. I tend to fall more into the “Love them!” group, but I had an encounter yesterday that has me seething a little bit.

My preferred LLM is Claude. I find it to be very good at producing code, which is my main use case for it. I’ve also found it gives really good conversational answers when I ask it non-programming questions.

After reading a tweet where someone mentioned having an LLM map out your life, or at least the life you want to live, I started thinking about my efforts to launch a new app as a side project. It hasn’t gone the way I wanted it to, and I wondered if talking to Claude as a “coach” could help me get back on track with things.

First obvious question: Why not talk to a human coach? In this instance I am still trying to figure out what I want to do and paying someone to listen to me whine doesn’t seem like a good use of anyone’s time. I believe in paying for services received, and I feel stressed just thinking about what questions I would like to ask a coach, or what I would want from them.

So using Claude as a starting point makes a lot of sense to me: It is a computer so it won’t get mad if I ask dumb questions, and it is always there so I don’t have to wait days for an appointment. Getting some of the basic questions out of the way would be a good exercise for me, and a headache saver if I ever do talk to a human coach.

Additionally these LLMs have been trained on all kinds of books and blogs and newspaper articles covering a lot of humanity’s knowledge. The LLMs “know” a lot about the successful business techniques out there, so this seems like a really good way to approach this problem.

The wheels start to come off

Well, really good at first. The conversation got off to a great start: Claude gave me some good conversation starters for a networking event I was going to. After the event I told Claude what I did and it offered some really insightful feedback on what I could have said, and what I should say next time.

When I mentioned that there’s a 3-week break until the next meeting and that I’d like to explore validating a new idea, it was more than happy to jump in and help. Our brainstorming session quickly got to the idea of a landing page. Claude whipped up a really good prototype that just needed a few tweaks to be able to collect email addresses.

I quickly got wrapped around the axle trying to get an email collection form in place. (WOW! Mailchimp and everyone else have really raised their prices since I last used them.) Eventually I got a solution in place and deployed. I was just about to move on to buying some ads to start driving traffic…

Wait, what is this offer?

For some reason, although I was moving this text all around, I never actually read the offer until this point. What the Claude-written copy was doing was trying to sell a specific solution.

But we had been talking about how to research the pain points of the customers so we could validate A solution.

I was furious. I had wasted a lot of time (like 2 hours) wrestling with a landing page that wasn’t even going to answer the core question we were discussing. I pointed this out to Claude and it was all “Oh yeah, that makes sense. Let’s do this instead…”

Skipping down the road holding hands, who knows where we will wind up

So what is the lesson here?

When using an LLM to brainstorm or get coaching you need to keep a few things in mind:

1. It wants to answer your questions, so you need to make sure YOU stay on topic.
LLMs are just a mirror: They know what we humans have taught them, so they try to reflect that knowledge back to us. If you start to have “two conversations” at once with a human, most can handle this. LLMs don’t seem to make that distinction and instead try to meld the two topics into one giant topic.

So, when having the conversation, keep it on topic. Humans will thank you, and you will thank yourself.

2. If you need to dig in on a specific detail, consider doing that in a new chat session.
With just enough context to start with, most LLMs can give a good, concise answer. In this situation I should have moved the technical discussion of the landing page into a new chat.

If nothing else, this would have helped keep the history of the chat cleaner. I think it also might have led to me discovering the issue earlier.

3. Brainstorming is ok, but you need to do regular sanity checks.
As mentioned in point #1, the LLM will answer your questions. If you veer off a little bit over the course of your conversation, I am convinced this will affect the answers it gives in the end.

Next steps

For me the next move is to revisit the last bit of the conversation and determine if I should start a new, more focused conversation. This is probably the best move, as I can start the new one off strong with a focused topic and no baggage from the past failed attempt.

tags: thinking

Possibility and Probability

A Python programmer with a personality thinking about space exploration
