...

dwringer

Karma: 1257
Created: 2017-02-20

Recent Activity

  • Google Gemini (2.0 Flash, free online version) handled this reasonably well; it gave me an arguably unnecessary calculation of the individual prices of the ball and bat, but then ended with "However with the information given, we can't determine exactly how many balls and bats Sally stole. The fact that she has $20 tells us she could have stolen some, but we don't know how many she did steal." While "the fact that she has $20" has no bearing on this - and the model seems to wrongly imply that it does - its conclusion that there is insufficient information to determine an answer is correct, so the model got the answer essentially right.

  • IME it is rarely productive to ask an LLM to fix code it has just given you within the same session context. It can work, but I find the second version often introduces at least as many errors as it fixes, or else changes unrelated bits of code for no apparent reason.

    Therefore I tend to work on a one-shot prompt and restart the session entirely each time, tweaking the prompt based on each output in the hope of getting a better result (I've found it helpful to point out the AI's past errors as "common mistakes to be avoided").

    Prompting this way also vastly reduces the context size sent with each individual request (asking the model to fix something it just produced in conversation tends to resubmit a huge chunk of context and eat into allowance quotas). Then, if there are bits the AI never quite got right, I'll go in piece by piece and ask it to fix an individual function or two, in a new session with heavily pruned context.
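
    As a very rough sketch of that loop (the generate() function is only a placeholder for whichever model or API is actually in use, and the task text and list of past mistakes are made-up examples):

        # Build a fresh, self-contained prompt for each attempt instead of
        # continuing the old conversation.
        BASE_TASK = (
            "Write a Python function parse_log(path) that returns a list of "
            "(timestamp, level, message) tuples from the log format below.\n"
            "<log format description here>"
        )

        # Errors noticed in earlier attempts, fed back in as
        # "common mistakes to be avoided".
        known_mistakes = [
            "Do not assume every line contains a timestamp.",
            "Do not change the function signature between revisions.",
        ]

        def build_prompt(task, mistakes):
            if not mistakes:
                return task
            avoid = "\n".join("- " + m for m in mistakes)
            return task + "\n\nCommon mistakes to avoid:\n" + avoid

        def generate(prompt):
            raise NotImplementedError("stand-in for the actual model call")

        # Each attempt is a brand-new session: only this small prompt is sent,
        # never the whole previous conversation.
        prompt = build_prompt(BASE_TASK, known_mistakes)
        # code = generate(prompt)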

  • The KoboldCpp UI I was using had a pretty straightforward way of setting up a chatbot-style dialogue interface that would send the dialogue context so far with each message, along with some supporting prompt scaffolding reminding the model how it's supposed to reply given that context. It also allowed editing the context directly, but I rarely did that except to correct simple one-off inaccuracies, usually opting instead to restart from the beginning when things went off-track.

    When the model would reply as multiple characters at once, or interleave narration with the dialogue, I'd usually just play along - most commonly it would carry the dialogue forward on its own for 2 or 3 turns, hallucinating the things I would say.

    I did a lot of experimenting with different ways to frame the dialogue: as a Zork-like adventure game, as an instant-messenger chat, or as paragraphs of prose with dialogue mixed in as if from a novel. Each of these framings had different strengths and weaknesses, but the model tended to blur between them after a few messages anyway, so I'd just play along as far as I could each time.
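
    For a sense of the mechanics, here is a hedged sketch of roughly what such a chat setup does under the hood, assuming KoboldCpp's KoboldAI-compatible /api/v1/generate endpoint on its default port; the preamble text, character name, and exact payload fields are my own guesses at a minimal working request, not the UI's actual internals:

        import requests

        API_URL = "http://localhost:5001/api/v1/generate"

        PREAMBLE = (
            "The following is a dialogue between You and Mira. "
            "Reply only as Mira, one message at a time, without narration.\n\n"
        )

        history = []  # the editable context; can be corrected, pruned, or cleared

        def say(user_text):
            history.append("You: " + user_text)
            prompt = PREAMBLE + "\n".join(history) + "\nMira:"
            resp = requests.post(API_URL, json={
                "prompt": prompt,
                "max_length": 120,
                "temperature": 0.7,
                "stop_sequence": ["You:"],  # discourage replying as both speakers
            })
            reply = resp.json()["results"][0]["text"].strip()
            history.append("Mira: " + reply)
            return reply

        # When things go off the rails, either edit history[] directly to fix a
        # one-off inaccuracy, or clear it and restart from the beginning.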

  • It's just one pretty trivial example, but many months ago I had to develop some backstory for my character in a D&D campaign. I had some character background and a few paragraphs I'd written to start with, but I wasn't sure exactly what direction to take it and was trying to come up with different ideas. I dropped what I had into a prompt built to set up a pseudo-text-adventure game using a local WizardLM model running through KoboldCpp, then ran through it interactively until it became incoherent, about 100 times in a row. This yielded some new ideas, some old ones, and an interesting distribution of outcomes overall: two or three endings came up most often, and all 100 runs were basically variations on about ten.

    I won't argue that anything the LLM came up with would stand on its own as particularly interesting - often quite the opposite - but it showed me all the "obvious directions to go", along with some other RPG clichés I could either adopt or deliberately avoid. Ultimately it served as an excellent brainstorming assistant, able to carry an idea through and embellish it with enough detail to get a sense of what works and what doesn't.
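
    A similarly hedged sketch of that repeated-sampling idea (again assuming KoboldCpp's /api/v1/generate endpoint; the backstory prompt is invented, and each run here is a single long completion rather than the interactive back-and-forth described above):

        import requests

        API_URL = "http://localhost:5001/api/v1/generate"

        SETUP = (
            "You are running a text adventure based on the following backstory:\n"
            "<character background and starting paragraphs here>\n"
            "> look around\n"
        )

        runs = []
        for _ in range(100):
            resp = requests.post(API_URL, json={
                "prompt": SETUP,
                "max_length": 300,
                "temperature": 1.0,  # higher temperature spreads the outcomes out
            })
            runs.append(resp.json()["results"][0]["text"])

        # The transcripts then get skimmed by hand for recurring directions -
        # which is where the "100 runs collapse to roughly 10 variations"
        # observation above comes from.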

HackerNews