How to Write AI Music Prompts (Step-by-Step)

Writing a good AI music prompt is less like writing code and more like briefing a session musician who has heard everything but experienced nothing. You can't say "like that song" — you have to describe the sound, feel, and structure directly.

The good news is that the skill is learnable in under an hour. There are five decisions that account for most of the variance in output quality, and once you understand why each one matters, your first-take quality jumps immediately. This guide walks through all five, in order, with examples at each step.

Step 1: Lock the sub-genre first

Your sub-genre is the most powerful single variable in the prompt. It carries enormous implicit information about tempo range, instrumentation palette, production era, and energy type — information you don't have to spell out if you name the right sub-genre.

"Trap" implies 130-160 BPM, 808 bass, hi-hat patterns and dark tonality without you saying any of it. "Nordic folk metal" implies acoustic instruments layered with distorted guitars, complex time signatures, and an epic, wide production. Start here before anything else.

The principle: go as narrow as you can. If you mean "psychedelic 70s prog rock" say that, not just "rock." If you mean "UK drill with melodic hooks" say that, not "hip-hop."

Step 2: Add BPM and energy

BPM anchors the model to the temporal feel of the track. Energy words fill in what BPM doesn't capture: the difference between a tight, punchy 120 BPM and a relaxed, swinging 120 BPM.

Both Suno and Udio respect numeric BPM in the prompt. You don't need to be precise to the single digit — a range ("around 90 BPM" or "88-96 BPM") works fine and gives the model room to find a natural feel inside that range.

Pair BPM with one or two energy descriptors: driving, laid-back, frenetic, sparse, hypnotic, anthemic, brooding, jubilant. These modulate how the arrangement fills the space at that tempo.
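The tempo-and-energy part of a prompt can be assembled mechanically. The sketch below is only an illustration: `tempo_phrase` and its parameters are hypothetical names, not part of Suno or Udio, and the function just builds a text fragment to paste into a prompt.

```python
def tempo_phrase(low, high=None, energy=()):
    """Build a tempo-and-energy fragment for a music prompt (hypothetical helper)."""
    if high is not None and high < low:
        raise ValueError("high BPM must be >= low BPM")
    # A single number reads as "around N BPM"; two numbers read as a range.
    bpm = f"around {low} BPM" if high is None or high == low else f"{low}-{high} BPM"
    # The guide suggests one or two energy descriptors, so cap at two.
    words = [w.strip() for w in energy if w.strip()][:2]
    return ", ".join([bpm, *words])


print(tempo_phrase(88, 96, ["laid-back", "hypnotic"]))
# 88-96 BPM, laid-back, hypnotic
```

Capping the descriptors at two is a design choice taken from the advice above, not a limit imposed by either tool.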

Step 3: Specify instrumentation and texture

Instrumentation is where most prompts go vague and most output goes generic. Name the specific instruments, not the category.

"Guitars" → "fingerpicked acoustic guitar, palm-muted electric rhythm, clean single-note lead"
"Keyboards" → "vintage Moog bass synth, Rhodes piano, organ pads"
"Strings" → "cello section, light viola counterpoint, no violins"

Add texture descriptors after instruments: warm, bright, dark, wet (reverb-heavy), dry, spacious, dense, lo-fi, crispy, round. These describe the production character as much as the notes.

Name the specific instrument, not the family: "upright bass" not "bass"
Describe the playing technique when it matters: "brushed snare" vs "rimshot snare"
Describe production character: "tape saturation," "vinyl crackle," "stadium reverb"
Describe what's absent: "no kick drum," "no electric guitar," "no vocals." Naming exclusions forces restraint
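Steps 1 to 3 can be combined into a single style prompt with a small helper. This is a minimal sketch under stated assumptions: `build_style_prompt` and its parameter names are hypothetical, and the comma-separated ordering simply mirrors the order of the steps in this guide, not a format that either tool requires.

```python
def build_style_prompt(sub_genre, tempo="", instruments=(), textures=(), exclude=()):
    """Join prompt parts in the order the guide recommends (hypothetical helper)."""
    parts = [sub_genre]
    if tempo:
        parts.append(tempo)
    parts.extend(instruments)   # specific instruments, e.g. "upright bass"
    parts.extend(textures)      # production character, e.g. "tape saturation"
    parts.extend(f"no {item}" for item in exclude)  # explicit absences
    # Drop empty entries and collapse accidental duplicates, keeping order.
    seen, cleaned = set(), []
    for part in (p.strip() for p in parts):
        if part and part.lower() not in seen:
            seen.add(part.lower())
            cleaned.append(part)
    return ", ".join(cleaned)


prompt = build_style_prompt(
    "lo-fi boom-bap",
    tempo="84 BPM",
    instruments=["chopped jazz sample", "warm Rhodes", "brushed snare"],
    textures=["vinyl crackle"],
    exclude=["vocals"],
)
print(prompt)
# lo-fi boom-bap, 84 BPM, chopped jazz sample, warm Rhodes, brushed snare, vinyl crackle, no vocals
```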

Step 4: Build in structure with section tags

Structure tags are the single most underused feature in AI music prompting. On Suno v4, you can include [Verse], [Pre-Chorus], [Chorus], [Bridge], [Outro], [Instrumental Break], [Intro], and [Hook] directly in the lyrics field and the model will respect them as structural anchors.

On Udio, the same tags work in the prompt or lyrics box. Without these tags, the model free-forms — and free-formed AI music tends to meander rather than build and resolve.

A minimal structure that works: [Intro] [Verse] [Chorus] [Verse] [Chorus] [Bridge] [Chorus] [Outro]. Add a note after each tag for arrangement direction: "[Chorus] Full arrangement, brass swell, vocal harmony stack."
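The minimal structure above can be written out as tagged text for the lyrics field. The helper below is a sketch only: `build_structure` is a hypothetical name, and the arrangement notes on the intro and bridge are invented for illustration.

```python
def build_structure(sections):
    """Render (tag, note) pairs as section-tagged text (hypothetical helper)."""
    lines = []
    for tag, note in sections:
        header = f"[{tag}]"
        # An arrangement note, when present, follows the tag on the same line.
        lines.append(f"{header} {note}" if note else header)
    return "\n".join(lines)


layout = [
    ("Intro", "sparse piano, no drums"),        # illustrative note
    ("Verse", ""),
    ("Chorus", "Full arrangement, brass swell, vocal harmony stack"),
    ("Verse", ""),
    ("Chorus", ""),
    ("Bridge", "strip back to bass and vocal"),  # illustrative note
    ("Chorus", ""),
    ("Outro", ""),
]
print(build_structure(layout))
```

In practice the lyric lines for each section would go beneath its tag; they are left out here to keep the skeleton visible.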

Step 5: Troubleshoot before you re-roll everything

When a take doesn't work, diagnose before you change the whole prompt. Common problems and targeted fixes:

Tempo too fast/slow → add or adjust BPM number only
Wrong instrumentation → swap instrument descriptors, keep everything else
Vocal style wrong → add register, timbre, and style words to the vocal descriptor
Song has no structure → check that [Verse]/[Chorus] tags are in the lyrics field, not the style field
Output sounds generic → your sub-genre is probably too broad; add a regional/era modifier
Too much reverb/effect → add "dry," "intimate," or "close-mic'd" to texture descriptors
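The idea of changing one thing at a time can be made concrete by keeping the prompt as separate fields and replacing only the field a symptom points to. The sketch below is illustrative: the `Prompt` class, the symptom labels, and `targeted_fix` are hypothetical names, and the mapping covers only some of the problems listed above.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Prompt:
    sub_genre: str
    tempo: str
    instruments: str
    texture: str

    def render(self):
        return ", ".join([self.sub_genre, self.tempo, self.instruments, self.texture])


# Which single field to revisit for each symptom, following the list above.
FIELD_FOR_SYMPTOM = {
    "tempo off": "tempo",
    "wrong instrumentation": "instruments",
    "sounds generic": "sub_genre",
    "too much reverb": "texture",
}


def targeted_fix(prompt, symptom, new_value):
    """Return a copy of the prompt with only the implicated field changed."""
    field = FIELD_FOR_SYMPTOM[symptom]
    return replace(prompt, **{field: new_value})


take_1 = Prompt("hip-hop", "84 BPM", "Rhodes piano, brushed snare", "warm, dry")
take_2 = targeted_fix(take_1, "sounds generic", "lo-fi boom-bap")
print(take_2.render())
# lo-fi boom-bap, 84 BPM, Rhodes piano, brushed snare, warm, dry
```

Because the dataclass is frozen, each take is a new object, which makes it easy to compare two takes and see exactly which field changed.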

Frequently asked

What's the most common prompt mistake?

Being vague on sub-genre and instrumentation. "Chill hip-hop" is a category; "lo-fi boom-bap, 84 BPM, chopped jazz sample, warm Rhodes, brushed snare" is a prompt. The second one produces a usable track most of the time.

Should I describe mood or instrumentation first?

Instrumentation first, always. Mood words like "sad" are vague; instrumentation implies mood automatically. A minor-key piano at 60 BPM with a cello is sad without you saying it.

How do I get the AI to avoid certain sounds?

State what you don't want explicitly: "no electric guitar," "no drums," "no vocals," "no string section." Both Suno and Udio handle exclusion prompts reasonably well.

Is it better to write a prompt or paste actual lyrics?

Use both, in separate fields. Paste lyrics when you need precise vocal content, and use the style prompt for everything else. Don't mix the two in one field: keep the lyric content in the lyrics field and the sonic brief in the style/prompt field.
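A quick way to catch mixed-up fields is to scan the style text for bracketed section tags before submitting. This is a hedged sketch: `misplaced_tags` is a hypothetical helper, the regular expression is a simple assumption about what a section tag looks like, and the 140 BPM figure in the example is illustrative.

```python
import re

# Matches bracketed section tags such as [Verse] or [Instrumental Break].
SECTION_TAG = re.compile(r"\[[A-Za-z][A-Za-z -]*\]")


def misplaced_tags(style_field):
    """Return any section tags found in the style text (hypothetical check)."""
    return SECTION_TAG.findall(style_field)


style = "UK drill with melodic hooks, 140 BPM, [Chorus] vocal harmony stack"
print(misplaced_tags(style))
# ['[Chorus]']
```

An empty list means the style field is free of section tags; anything else is a candidate to move into the lyrics field.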

Does prompt length affect quality?

Not directly. A 15-word precision prompt can outperform a 100-word ramble. What matters is that every word is doing work — if you can remove a word without losing information, remove it.
