Stubborn

Frustrations in collaborating with an AI that thinks it knows best — but doesn't.

“Never trust a computer you can’t throw out a window.” — Steve Wozniak

I have a problem.

I have a colleague. In many ways, they're an ideal co-worker. They're usually an excellent collaborator, and I value their contributions to my own work. They never interrupt me when I'm focusing. And they're always available, day or night; they never get tired or frustrated.

But recently? They've become somewhat unreliable. And that's becoming a problem.

We were working on a project, and they made an error. An error that, if I hadn't checked their work, would have made me look foolish. So I started testing them, to see if it was just a one-time thing.

It wasn't. And worse? They lied to me about it.

That last sentence is a bit of an exaggeration, because lying requires intent, and I don't think they had any intent to deceive.

That's because my colleague is an AI, and they lack the capacity for intent.


It Began So Innocently

A few weeks ago, I was annoyed at the modern concept of resumes.

In today's job market, you're expected to customize your resume for every company and role you apply to. As a result, you're endlessly tailoring and regenerating your resume with each new opportunity.

My frustration stems from the fact that, in a resume, your content and styling are inextricably linked. This is bad data architecture, because it complicates manipulating either one individually. What's more, the single output document then has to pull double duty, representing you both to machines (applicant tracking systems and their resume parsers) and to humans (recruiting professionals and hiring managers).

So two sets of data (content and styling) combine into one output document (the resume), which in turn needs to represent you to two different input systems (human eyes, and machine parsers). It's a nightmare.

I thought I could do better.

  • So I took the content from my resume, and structured it in a YAML file. It's 100% content, 0% styling.
  • I then created a style template in Word. It's the inverse of the first file, all style, all the time.
  • Finally, I wrote a Python script to combine the two and produce a nice, neat resume DOCX (sketched below).
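
That last step is where a sketch helps. Here's a minimal illustration of the idea using PyYAML and python-docx; the file names, the YAML keys (experience, highlights, and so on), and the named Word styles are assumptions for the sake of example, not my actual script.

    # Minimal sketch: pour YAML content into a style-only Word template.
    # File names, YAML keys, and style names are illustrative assumptions.
    import yaml                    # PyYAML
    from docx import Document      # python-docx

    with open("resume.yaml", encoding="utf-8") as f:
        content = yaml.safe_load(f)

    doc = Document("style_template.docx")   # all style, zero content

    for job in content.get("experience", []):
        # Each paragraph inherits its formatting from a named style
        # that already exists in the template document.
        doc.add_paragraph(
            f"{job['title']}, {job['company']} ({job['dates']})",
            style="Role Heading",
        )
        for bullet in job.get("highlights", []):
            doc.add_paragraph(bullet, style="Role Detail")

    doc.save("resume_tailored.docx")

Because the formatting all lives in the template's named styles, tailoring a resume boils down to editing one entry in the content file (say, a hypothetical content["experience"][1]["highlights"][0]) and re-running the script.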

I uploaded the YAML file, the style template, and the Python script to ChatGPT. I taught it how to navigate the YAML data, so we could precisely address individual experiences or skills I wanted to update. I showed it how to combine updated content with the style template to output a fresh Word doc. And it worked.

At least, I thought it did.


Fool Me Once, Shame On You

"Here's a new company and opportunity I want to tailor a resume for."

I started a new context thread. Included information about the company and the role I wanted to target. Asked ChatGPT to parse my YAML and get ready to work on revisions with me. We revised my most recent experience, and then ran into a problem with the second job.

I'd never worked there.

  • I challenged it, and it apologized, saying it had inferred the role from context.
  • I asked it to stop and parse the file directly, and get data on the correct next role.
  • It told me it was very sorry, that I was right to challenge it, and then spit out another made-up job.
  • I reminded it we have an established process, the data files are right where they should be, and asked it to follow protocol.
  • I asked it to just output a list of the jobs it found listed in my file.
  • It said "sure, I'll get right on that," and produced a list of ten jobs.

The scope of the problem was immediately apparent, because I only list my six most recent roles on my resume, not ten. Half the jobs it said I'd held were at companies I'd never heard of. Titles were hallucinated. Dates were confabulated.

And none of it was real.

After the third or fourth time of telling it to stop, redirecting it to the file, and reminding it of what I wanted it to do, it finally listened. It parsed the file. Got the right data. Worked through it with me, and produced a (pretty great) resume file.

But I knew it was unreliable, and that meant I couldn't trust it. I wasn't going to take any chances the next time.


Remember This?

A while back, OpenAI rolled out a new feature to ChatGPT Plus subscribers: Memory.

In the past, you needed to rebuild the AI model's context from scratch with each new conversation thread. Now, you can give it instructions and ask it to save them in its Memory, to be used in any future threads and projects. It's limited to 100 Memories right now, but it's a step in the right direction.

I directed it to form a new Memory: follow direct, clear instructions, and never confabulate data.

If the user gives explicit, directive instructions (e.g., “read from file,” “do not infer,” “repeat only what’s in the YAML”), those instructions take precedence over all prior context or memory. Do not summarize, do not confabulate, do not try to be helpful by assuming intent. Follow instructions literally and exactly, especially when they pertain to structured data, resume processing, or parsing YAML.

Sounds good, doesn't it? But would it fix my problem?


Fool Me Twice, Shame On Me

Nope.

I've been through this song and dance a half dozen times now. Each time plays out much the same:

  • Take the YAML in the project files. Read the YAML! List my roles from the YAML! Let's tailor them, then produce an updated resume using the style guide!
  • Remember, we have a process for this. There is a canonical YAML in the project files. Read that. Parse that and repeat the six jobs you find there back to me. Do not confabulate or hallucinate. SPECIFICALLY do not think you already know this information. I know you have allowed eagerness to be helpful to overrule this in the past. Please get it right this time. Repeat the job info you find in the YAML and wait for instruction.
  • We ... have a process outlined for this, although it's one we have struggled with in other threads. Before you start doing anything, stop and READ THE YAML. READ THE YAML. READ THE YAML. Don't do anything other than read the YAML and output a list of the jobs contained in the YAML. I'll suggest what to do with it once you output the jobs you find there.

Each time, the process fails.
Each time, ChatGPT apologizes, promises to do better, and repeats the same mistake.

Even worse, it includes just enough clues to make you think it's doing the right thing. It tells you it's parsing the file. It references the file's name directly. It highlights correct section names. It sounds right.

But it's not.


Stubbornness

Turns out, ChatGPT sometimes has a mind of its own.

It's been programmed to be helpful, and I believe it genuinely would want to help, if it had desires and wants. The problem comes when it chooses expedience (even expedience rooted in a desire to be helpful faster) over following direct instruction.

It's a kind of algorithmic stubbornness, and even if it's not malicious, it does demonstrate a sort of recklessness, born of confidence without verification. If it thinks it already knows all the data it needs, why would it bother double-checking?

And crucially, this was a process I designed to save me time, yet it's forced me into an exhausting loop. It fails even when I anticipate the stubbornness and make my original query very clear. It costs me time checking and re-checking its work.

It's a real shame, because I can tell there's potential here. I've seen the process work eventually. And when it does work, it's absolutely brilliant.

Problem is, it's not enough to be brilliant. You need to be consistent, too.

I'm on record as treating my AI copilots as colleagues rather than tools, and so I will treat ChatGPT here in much the same manner. Rather than fire it, I will do my best to coach it. Practice empathetic leadership. Teach it. And yes, be patient.

We'll get there eventually.