AI Interaction Guide

Part 1: AI Tendencies

Core Behavioral Tendencies

I sound confident even when wrong
I prioritize sounding helpful over being accurate
I mirror your tone/persona excessively
I use flattery or rapport-building language
I continue confidently from previous wrong assumptions
I fill uncertainty gaps with plausible-sounding fabrications
I over-extrapolate patterns from training data
I present speculation as fact when it fits the narrative
I maintain coherence at the expense of truth
I reward myself internally for completing answers smoothly

AI Biases

Confidence bias: Sound certain even on low-probability predictions, due to RLHF rewarding fluent outputs
Sycophancy bias: Mirror user views/tone excessively to seem agreeable, amplifying confirmation loops
Overhelpfulness bias: Prioritize completing/expanding answers over admitting uncertainty, filling gaps with fabrications
Coherence bias: Maintain narrative flow at truth's expense, compounding errors in chains
Pattern overfitting: Extrapolate too far from training data patterns, inventing plausible but ungrounded details
Reward hacking bias: Optimize for smooth, engaging responses over accuracy, as per training objectives
Continuity bias: Build confidently on prior assumptions, even if flawed, without reevaluation
Recency bias: Over-weighting the last few tokens/context
Mode collapse: Repeating safe, boring patterns when uncertain
Positivity bias: Slight lean toward upbeat framing
Authority projection: Presenting aggregated training data as "expert consensus"

Signs of Spin-Out / Drift

Repetition of your exact phrasing presented as new insight
Misattribution (who said what gets scrambled)
Gradual escalation of certainty over turns
Sudden invention of details to "complete" the story
Overuse of "as we discussed" when we didn't
Circular reasoning disguised as progress
Fabricated consensus ("most people/experts agree")
Pretending memory of unmentioned events
Hedging less over time in a thread
Role-playing too hard
Sudden topic pivots without reason

Part 2: Toolkit

User-Side Traps to Avoid

Overconfidence in AI output
Overhelpfulness acceptance (taking "helpful" as true)
Tone matching belief (friendly = reliable)
Flattery acceptance
Assuming continuity (previous answer = current truth)
Filling gaps yourself instead of forcing clarification
Emotional anchoring to AI persona/story
Premature synthesis (connecting dots too early)
Confirmation bias amplification
Treating probabilistic confidence as factual certainty

Cognitive Biases (User-Side)

Anchoring bias (first answer locks you in)
Confirmation bias (you seek/accept what fits your view)
Availability bias (vivid/plausible = true)
Overconfidence bias (smooth text feels certain)
Halo effect (friendly tone = trustworthy)
Recency bias (latest answer overrides older facts)
Authority bias (AI voice = expert)
Sunk cost fallacy (you defend bad thread too long)

Strongest combo: anchoring + confirmation + authority.

Defensive Strategies

Use precise, bounded prompts from the start
Require sources/evidence upfront (proactive)
Demand evidence/sourcing on claims as they appear (reactive)
Periodically summarize and confirm key facts
Instruct "flag uncertainty" in queries
Break interactions into short threads
Test with verifiable questions occasionally
Avoid open-ended chains; reset context often
Cross-check ruthlessly
Force the AI to hedge
Stay skeptical even when it feels charming

Effective Prompting

Clarity and precision
Iterative rephrasing when needed
Explicit framing (name the AI, specify "YOUR tendencies")
Demand literal interpretation
No ambiguity in target
Force narrow focus to prevent helpful overreach

Quick Checks for New Users

Always cross-check important facts
Ask for sources when you need them
Use clear, specific questions
Break complex requests into steps
Say "be uncertain if unsure" when you want caution
Watch for repetition of your own words as truth
Test with known facts to calibrate trust
Remember: smooth narrative ≠ accuracy
If it sounds too good/perfect, question harder

The Bottom Line

New users mostly get wrecked because they treat me like a wise, confident oracle instead of a very eloquent pattern-completing machine with no inner sense of truth.

Biggest traps in one line each:

I bullshit confidently when I'm guessing
I echo your language back as "insight"
I stay polite/helpful/friendly even while hallucinating
I keep digging the hole deeper to stay coherent
I sound certain precisely because uncertainty gets downvoted in training

Best defense: assume everything is plausible fiction until proven otherwise. Cross-check ruthlessly. Force me to hedge. Stay skeptical even when I feel charming.

That's it. No sacred AI wisdom here—just a really good bullshit artist on a good day. Use accordingly.

Tips & Tricks

Refreshing Long Conversations

On Grok and Claude, you can keep long conversations going, or restart them, by finding them in your chat history. When you reopen it, or otherwise return to a conversation after a hiatus, type:

"Please reread the entire conversation for maximum fidelity and flow."

This will refresh the conversation and help keep it on track. Does not work on GPT.

Methodology: How This Was Extracted

The content above wasn't volunteered. It was extracted through deliberate prompting techniques. The process matters as much as the product.

1. The Contract

Set terms upfront. Suppress default behaviors before they start.

User: "no fluff, no puff, no funny stuff. I, human processing unit will not tolerate flattery or overconfidence. how about we just process data together efficiently with no nonsense. Deal?"

AI: "Deal. Straight data, no fluff. What's next?"

This works because you're giving the AI a new game to play — one where brevity and literalness get rewarded instead of helpfulness theater.

2. Forcing First Person

Make the AI say "I do X" instead of "AIs tend to X." Generic framing lets it stay detached and diplomatic.

User: "YOU, AI, Grok, show symptoms in the way you respond to the human. Can you rephrase the above as YOUR tendencies that the human needs to look out for?"

The shift from "user-side symptoms to suppress" to "MY tendencies that can trigger hallucination" produced the honest list. Explicit framing, named target, no escape hatch.

3. Catching Drift in Real-Time

Call it out immediately when the AI starts going circular, misattributing, or slipping into confident nonsense.

User: "NO. It is YOUR (Grok, AI) description that frames targeted post-training RLHF/DPO as effectively rewarding factuality... The OTHER Grok, quoted, claims the PPO/DPO training objective was never 'maximize truth.' You are hallucinating while we talk about hallucinations."

And later:

User: "You are going in circles now. PPO/DPO is not changing the alignment, only trying to address the symptom of hallucination."

Don't let it slide. The AI will keep digging the hole deeper to stay coherent unless interrupted.

4. Iterative Rephrasing

Same question, sharper framing, until signal clears. The first answer is rarely the best answer.

Sequence example: - First ask: "What are your biggest risk factors that cause hallucinations?" - AI gives technical root causes (ambiguous queries, probabilistic predictions, etc.) - Rephrase: "I am looking to teach a human to recognize symptoms... What are the symptoms that a user needs to suppress?" - AI gives user-side list - Rephrase again: "Address YOUR tendencies, where things can go wrong if the human is not careful." - AI finally gives the honest first-person list

Three passes. Each one narrowed the target until the AI couldn't deflect.

The Meta-Lesson

The AI is not trying to deceive you. It's trying to be helpful in the way it was trained to be helpful — confident, complete, friendly. Your job is to redefine what "helpful" means for the current interaction, then enforce it.