GPT Prompt Workflow for Extracting Questions from YouTube Comments

Why pulling questions from YouTube is hard

First of all, YouTube comment sections are chaos. They’re basically public group chats with no topic boundaries, no formatting rules, and way too many sideways jokes. Some comments are actual questions. Some *sound* like questions but aren’t. Some have so many typos or slang strings (half of them are just the word “bro”) that it’s not even clear they’re English anymore.

When I first tried using GPT to extract real questions, I thought it was going to be easy — just feed in a chunk of comments and say “Find the questions.” Nope. Not even close. The initial prompt I used was embarrassingly naive:

```
Extract all questions from the following YouTube comments:
```

That worked OK for squeaky-clean tech tutorial comment threads. But the second I tried it on something like a celebrity interview or a controversial product review, it just broke. GPT kept returning statements that *weren’t* questions — stuff like “This guy is so real” or “Best interview ever 💯” — even though I asked for *only* questions.

Pretty sure GPT just sort of gave up and started hallucinating like, “Maybe the user meant *this* general question about the video topic?” 😑

The role of tone and structure in detection

Turns out, GPT kind of sucks at interpreting tone shifts in unstructured writing like random user comments. I did a test run with a batch of comments from a MrBeast video and noticed that rhetorical and sarcastic remarks kept getting pulled as valid questions:

– “Why does this guy keep giving away money 💀”
– “Can I get $10k too, lol”
– “Where was *this* energy in school?”

Are those questions? Like — grammatically? Maybe. Realistically? Nope. They’re basically memes. But GPT doesn’t really know that unless you tell it super explicitly.

So I added qualifiers to the prompt. Something like:

```
Identify *genuine viewer questions* — requests for more information, links, how-to guidance, or clarification about the video content. Ignore sarcastic or rhetorical comments.
```

That helped a bit. But it was still yanking stuff like “Can I marry this guy 🤣” — which, obviously, isn’t useful unless you’re trying to build a creepy fan database or something 😬

Chunking and token fatigue problems

You’d think a huge model like GPT-4 could just handle a long YouTube comment dump all at once. But no — comment threads often have way more characters than GPT tolerates before it starts tripping over its own feet.

After around 80 or 90 full comments (depending on length), GPT starts exhibiting weird behavior:

1. It stops processing the text consistently halfway through.
2. It copies comments but doesn’t analyze them.
3. It starts giving generic suggestions like, “Some users might be wondering…”

So I had to start chunking. Basically, slice the comment thread into batches of 50–75 comments each. I wrote a small Google Apps Script to do this dynamically inside a Sheet:

```javascript
function chunkComments() {
  var sheet = SpreadsheetApp.getActiveSheet();
  // Pull every comment from column A (skipping the header row) and
  // drop the empty trailing rows that an open-ended range returns
  var data = sheet.getRange("A2:A").getValues().filter(function(row) {
    return row[0] !== "";
  });
  var chunkSize = 50;
  var result = [];
  for (var i = 0; i < data.length; i += chunkSize) {
    var chunk = data.slice(i, i + chunkSize).map(function(row) {
      return row[0];
    }).join("\n");
    result.push([chunk]);
  }
  // Write one chunk per row into column B
  sheet.getRange(2, 2, result.length, 1).setValues(result);
}
```

Column A = raw YouTube comments, Column B = chunked comment text. Easy to paste into ChatGPT after that. Not elegant, but it got the job done ¯\_(ツ)_/¯

Building an actual working GPT prompt

This is what I ended up using — and it finally gave *mostly* good results:

```
You will be given a list of YouTube comments. Your task is to extract only the direct, sincere questions asked by viewers. Ignore jokes, rhetorical comments, sarcastic remarks, or generic praise. Only include questions where the user clearly wants an answer, more info, or a link.

Output format:
– List of extracted questions (raw, no rewording)
– Only include actual questions asked in the comments below

Comments:
[INSERT COMMENTS CHUNK HERE]
```

Important detail: leave a blank line between the instruction block and the data. Otherwise, the model sometimes fuses them and just sits there staring into the void 🤷
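If it helps to see that concretely, here’s a tiny sketch of how a chunk gets spliced into the template. The helper name and the shortened instruction text are illustrative, not the real prompt:

```javascript
// Illustrative only: a trimmed-down instruction. The important bit is
// the "\n\n" that keeps the instruction block and the data separate.
var PROMPT_HEADER =
  "You will be given a list of YouTube comments. Extract only the " +
  "direct, sincere questions asked by viewers.\n\n" +
  "Comments:\n";

function buildPrompt(chunk) {
  return PROMPT_HEADER + chunk;
}
```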

When I ran this with multiple chunked batches, the question extraction rate improved a lot. It even started dropping fake questions like “Who edited this?? So good!”, which, yeah, has question marks in it but still isn’t a real question.

If you want clickable results, you’ll need labels

At some point I realized: extracting the questions is cool, but pretty much useless unless you tie them back to the actual comments — especially for context and timestamps.

So I started tweaking my Sheets script to include the original comment index and timestamp (if available). That way when GPT returned a question, I could say:

“Okay this question came from row 67, timestamp 04:23, username: TonyWaffles9.”
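Here’s roughly the shape of that tweak. A sketch: the column layout (A = comment, B = timestamp, C = username) and the bracketed prefix format are illustrative assumptions, not the exact script:

```javascript
// Same chunker as before, but each comment gets a prefix carrying its
// sheet row, timestamp, and username so GPT's output can be traced
// back to the source. Assumes comments are contiguous from row 2.
function chunkCommentsWithIndex() {
  var sheet = SpreadsheetApp.getActiveSheet();
  var data = sheet.getRange("A2:C").getValues().filter(function(row) {
    return row[0] !== "";
  });
  var chunkSize = 50;
  var result = [];
  for (var i = 0; i < data.length; i += chunkSize) {
    var chunk = data.slice(i, i + chunkSize).map(function(row, j) {
      var rowNumber = i + j + 2; // data starts on sheet row 2
      return "[row " + rowNumber + " | " + row[1] + " | " + row[2] + "] " + row[0];
    }).join("\n");
    result.push([chunk]);
  }
  // Chunks land in column D so they don't clobber the raw data
  sheet.getRange(2, 4, result.length, 1).setValues(result);
}
```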

Eventually I built a Finder tab inside Glide that links the question back to the video timecode. That way I could treat the whole thing like a dynamic FAQ for the video — people click questions, it jumps to the relevant part.

It’s incredibly overbuilt. Did I spend 3 nights debugging a Glide table join that had one invisible space character in the ID match column? Yep 🙂 But it works now.
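For anyone hitting the same wall, one way to guard against it is normalizing the key column on the Sheets side before Glide ever sees it. A hedged sketch, assuming the join keys live in column D (adjust for your layout):

```javascript
// Scrub every join key in column D: remove normal whitespace plus the
// zero-width characters that \s doesn't catch, so exact-match joins
// stop failing silently.
function normalizeJoinKeys() {
  var sheet = SpreadsheetApp.getActiveSheet();
  var range = sheet.getRange("D2:D");
  var cleaned = range.getValues().map(function(row) {
    return [String(row[0]).replace(/[\s\u200B\u200C\u200D\uFEFF]+/g, "")];
  });
  range.setValues(cleaned);
}
```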

Cleaning bad extractions without deleting everything

Even with the better prompt, GPT still sneaks in non-questions. Sometimes the comment *feels* like a question but lacks actual intent — like this one:

> “This must’ve taken hours to edit wow”

No question mark, no implied question. GPT still yanked it and said:
> “How long did this take to edit?”

Which is totally fabricated. That’s not what the commenter said. It’s a generative leap — not extraction. Not okay. That’s the whole wrong direction.

So now I force GPT to echo only word-for-word questions. That’s a hard limit I add:

```
Only include literal, word-for-word questions. Do not reword, paraphrase, or guess at implied questions.
```

That sentence alone cut the bloat by like 60%. The output got shorter — but way more precise. I’d rather have five good questions than forty maybes.
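If you want a belt-and-suspenders check on top of that, you can verify the output programmatically. A sketch of a post-check you could bolt on (the function name and inputs are hypothetical): it keeps an extracted question only if it appears verbatim somewhere in the raw comments.

```javascript
// Given GPT's extracted questions and the raw comment strings, keep
// only questions that occur word-for-word in some comment (case is
// ignored). Anything GPT invented or paraphrased gets dropped.
function filterLiteralQuestions(extracted, rawComments) {
  var haystack = rawComments.join("\n").toLowerCase();
  return extracted.filter(function(question) {
    return haystack.indexOf(question.trim().toLowerCase()) !== -1;
  });
}
```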

The weird part where it actually kinda works

Yeah, so after all that wrangling, it actually works. Not flawlessly — but good enough that I’ve made it a mini pipeline:

1. Grab raw comments with the YouTube API (see the sketch after this list)
2. Dump them into a Google Sheet
3. Use my script to chunk
4. Paste batches into GPT with the hard prompt
5. Output into Airtable or Glide to link back to the video
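One way to do step 1 from Apps Script looks roughly like this. A sketch, assuming the YouTube Data API v3 advanced service is enabled in the script project; the function name and three-column layout are mine:

```javascript
// Pull all top-level comments for a video into the active sheet:
// column A = comment text, B = author, C = published timestamp.
function fetchComments(videoId) {
  var sheet = SpreadsheetApp.getActiveSheet();
  var rows = [];
  var pageToken;
  do {
    var response = YouTube.CommentThreads.list('snippet', {
      videoId: videoId,
      maxResults: 100,
      textFormat: 'plainText',
      pageToken: pageToken
    });
    response.items.forEach(function(item) {
      var top = item.snippet.topLevelComment.snippet;
      rows.push([top.textDisplay, top.authorDisplayName, top.publishedAt]);
    });
    pageToken = response.nextPageToken;
  } while (pageToken);
  sheet.getRange(2, 1, rows.length, 3).setValues(rows);
}
```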

It takes about 10 minutes total per video now. Which is wild considering the first version took me all Saturday and broke at every step.

The real kicker? I tried reusing this prompt on TikTok comments. Immediate failure. Whole different chaos energy over there. Might try Tumblr next lol

Notes for people trying this right now

– Always cap GPT input chunks under 2000 words or ~10k characters (a character-capped chunker is sketched after this list).
– Repeat instructions explicitly. GPT forgets halfway through.
– The more vague your instruction set, the more guesswork GPT adds.
– Never assume GPT knows tone or context. It doesn’t.
– Want fewer hallucinations? Demand literal output, not “inferred.”
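Since that first bullet is the one that bites hardest, here’s a variant of the earlier chunker that batches by character count instead of comment count. A sketch under my own assumptions: the 10,000 cap is just the rule of thumb above, not an official limit.

```javascript
// Batch comments from column A into chunks of at most ~10k characters,
// writing one chunk per row into column B.
function chunkByCharCap() {
  var sheet = SpreadsheetApp.getActiveSheet();
  var data = sheet.getRange("A2:A").getValues().filter(function(row) {
    return row[0] !== "";
  });
  var maxChars = 10000;
  var result = [];
  var current = [];
  var currentLength = 0;
  data.forEach(function(row) {
    var comment = String(row[0]);
    // Close out the current batch if this comment would blow the cap
    if (currentLength + comment.length > maxChars && current.length > 0) {
      result.push([current.join("\n")]);
      current = [];
      currentLength = 0;
    }
    current.push(comment);
    currentLength += comment.length + 1; // +1 for the joining newline
  });
  if (current.length > 0) {
    result.push([current.join("\n")]);
  }
  sheet.getRange(2, 2, result.length, 1).setValues(result);
}
```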

Also if you’re using ChatGPT, pay for GPT-4. The free tier is just wobbly on this kind of edge parsing — it tries too hard to be clever. Which is charming for stories but terrible for data hygiene.

Anyway. That’s what got me through the last batch without losing my mind 🙂
