Features

MIDI Sketch is a music theory-based MIDI generator that creates complete pop music arrangements.

MIDI Output, Not Audio

Unlike AI audio generators (Suno, Udio, etc.), MIDI Sketch outputs editable MIDI data.

| | AI Audio Generators | MIDI Sketch |
|---|---|---|
| Output | Finished audio (MP3/WAV) | MIDI files |
| Editing | Limited or none | Full control in DAW |
| Sounds | Fixed | Your choice |
| Mixing | Baked in | You decide |
| Reproducibility | Often inconsistent | Deterministic (seed-based) |

What You Get

  • 9 separate tracks (vocal, aux, chord, bass, motif, guitar, arpeggio, drums, SE)
  • Each track on its own MIDI channel
  • Import directly into any DAW
  • Use your own instruments and effects

Music Theory Foundation

MIDI Sketch doesn't use machine learning or neural networks. It implements classical harmony principles combined with modern pop music analysis.

Melody Generation

Template-Driven Architecture

7 melody templates model specific vocal styles:

  • PlateauTalk: NewJeans/Billie Eilish style - high plateau with talk-sing
  • RunUpTarget: YOASOBI/Ado style - ascending runs to target notes
  • HookRepeat: TikTok/K-POP style - short repeating hooks
  • SparseAnchor: Official髭男dism style - sparse anchor notes
  • And more (DownResolve, CallResponse, JumpAccent)

Singability Constraints
  • Direction inertia: Accumulated momentum tracking prevents erratic direction changes
  • Tessitura enforcement: Real-time pitch adjustments for comfortable singing range
  • Leap compensation: Automatic stabilization steps after large intervals
  • Vowel constraints: Pitch movement limited within vowel sections for natural phrasing
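As a rough illustration of leap compensation, the sketch below forces the note after any large leap to step back in the opposite direction. The function name, the 7-semitone leap threshold, and the 2-semitone recovery step are assumptions chosen for the example, not MIDI Sketch's actual parameters.

```python
def compensate_leaps(pitches, leap_threshold=7, step=2):
    """Return a copy in which the note after each large leap steps back the other way.

    leap_threshold and step are illustrative values, not MIDI Sketch settings.
    """
    out = list(pitches[:2])
    for pitch in pitches[2:]:
        interval = out[-1] - out[-2]
        if abs(interval) > leap_threshold:
            direction = 1 if interval > 0 else -1
            # Replace the planned pitch with a step against the leap direction.
            pitch = out[-1] - direction * step
        out.append(pitch)
    return out


melody = [60, 62, 72, 76, 74]      # C4, D4, leap to C5, E5, D5
print(compensate_leaps(melody))    # the note after the leap becomes 70
```

Only the single note following the leap is changed; later notes are kept as planned.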

Voice Leading & Chord Voicing

Three Voicing Types
  • Close voicing: Notes within one octave (warm, suitable for verses)
  • Open voicing: Drop2, Drop3, Spread variations (powerful, for choruses)
  • Rootless voicing: Root omitted when bass provides it (jazz-influenced)

Voice Leading Optimization
  • Weighted distance calculation (bass and soprano get 2x priority)
  • Common tone maximization between successive chords
  • Parallel 5ths/octaves detection with context-aware enforcement
  • Avoid note detection (minor 2nd with chord tones, tritone with root)
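The weighted distance idea can be sketched as follows: sum the semitone movement of each voice and count the outer voices (bass and soprano) double. The function names and the candidate voicings are hypothetical; only the 2x weighting comes from the list above.

```python
def voice_leading_cost(prev_chord, next_chord, outer_weight=2.0):
    """Weighted semitone movement between two voicings with the same voice count.

    Voices are ordered low to high; the first (bass) and last (soprano)
    are multiplied by outer_weight.
    """
    if len(prev_chord) != len(next_chord):
        raise ValueError("voicings must have the same number of voices")
    last = len(prev_chord) - 1
    cost = 0.0
    for i, (a, b) in enumerate(zip(prev_chord, next_chord)):
        weight = outer_weight if i in (0, last) else 1.0
        cost += weight * abs(b - a)
    return cost


def best_voicing(prev_chord, candidates):
    """Pick the candidate voicing with the smallest weighted movement."""
    return min(candidates, key=lambda c: voice_leading_cost(prev_chord, c))


c_major = [48, 60, 64, 67]                      # C3, C4, E4, G4
f_major_options = [
    [53, 60, 65, 69],                           # keeps C4 as a common tone
    [41, 60, 65, 69],                           # bass drops to F2
    [53, 57, 60, 65],                           # upper voices all move down
]
print(best_voicing(c_major, f_major_options))
```

A fuller version would add the other criteria listed above (common tones, parallel 5ths/octaves, avoid notes) as extra cost terms.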

Non-Chord Tone (NCT) Decoration

Based on Kostka & Payne's Tonal Harmony framework:

Strong Beats and Weak Beats

In 4/4 time, strong beats (1 and 3) feel accented and stable, while weak beats (2 and 4) feel lighter. Melodies typically place chord tones on strong beats for harmonic clarity.

NCT Types

| Type | Placement | Description |
|---|---|---|
| Passing Tone | Weak beat | Stepwise connection between chord tones |
| Neighbor Tone | Weak beat | Step away from chord tone and return |
| Appoggiatura | Strong beat | Accented dissonance resolving by step |
| Anticipation | Before beat | Early arrival of next chord tone |
| Tension | Context-dependent | 9th, 11th, 13th extensions |

Mood-Dependent Configuration
  • Bright/Upbeat: 75% chord tones, pentatonic focus
  • CityPop: 50% chord tones, jazz tensions enabled
  • Ballad: 65% chord tones, expressive appoggiaturas
  • Dark/Dramatic: Chromatic approach notes enabled
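To make the passing tone and neighbor tone definitions concrete, here is a minimal classifier over a three-note window. It is a simplified sketch: it ignores beat placement (so it cannot tell an appoggiatura from other dissonances), and the function name and return labels are made up for the example.

```python
def classify_note(prev_pitch, pitch, next_pitch, chord_pcs):
    """Classify the middle note of a three-note window against a chord.

    chord_pcs is a set of pitch classes (0 = C ... 11 = B).
    """
    def is_chord_tone(p):
        return p % 12 in chord_pcs

    if is_chord_tone(pitch):
        return "chord tone"

    step_in = pitch - prev_pitch
    step_out = next_pitch - pitch
    stepwise = 1 <= abs(step_in) <= 2 and 1 <= abs(step_out) <= 2
    if stepwise and is_chord_tone(prev_pitch) and is_chord_tone(next_pitch):
        if next_pitch == prev_pitch:
            return "neighbor tone"       # step away and return
        if (step_in > 0) == (step_out > 0):
            return "passing tone"        # stepwise motion in one direction
    return "other"


C_MAJOR = {0, 4, 7}
print(classify_note(60, 62, 64, C_MAJOR))   # C-D-E: D connects two chord tones
print(classify_note(64, 65, 64, C_MAJOR))   # E-F-E: F leaves and returns
```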

Harmony Context & Collision Avoidance

Multi-Track Coordination
  • Track collision detection: Registers all notes from vocal, bass, chord, aux tracks
  • Low register strictness: 3-semitone threshold below C4 to prevent muddiness
  • Safe pitch resolution: Multi-strategy fallback (chord tones → consonant intervals → range search)
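The sketch below shows one way the register-dependent threshold and the three-stage fallback could fit together. Several details are assumptions: that "3-semitone threshold" means notes within 3 semitones collide when either note is below C4, that a 1-semitone threshold applies above C4, and the particular set of consonant intervals. All names are hypothetical.

```python
C4 = 60
CONSONANT = {0, 3, 4, 7, 8, 9}   # assumed consonant interval classes


def collides(candidate, sounding, low_threshold=3, high_threshold=1):
    """True if candidate sits too close to any sounding pitch."""
    for other in sounding:
        limit = low_threshold if min(candidate, other) < C4 else high_threshold
        if abs(candidate - other) <= limit:
            return True
    return False


def resolve_safe_pitch(desired, sounding, chord_pcs, low, high):
    """Fallback chain: chord tones, then consonant intervals, then range search."""
    nearest_first = sorted(range(low, high + 1), key=lambda p: (abs(p - desired), p))
    free = [p for p in nearest_first if not collides(p, sounding)]

    for p in free:                                   # 1. nearest chord tone
        if p % 12 in chord_pcs:
            return p
    for p in free:                                   # 2. nearest consonant pitch
        if all(abs(p - o) % 12 in CONSONANT for o in sounding):
            return p
    return free[0] if free else desired              # 3. anything free in range


# D3 against a C3 bass is too close in the low register; E3 is the nearest safe chord tone.
print(resolve_safe_pitch(50, [48], {0, 4, 7}, 48, 72))
```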

Emotion Curve System

Song Emotional Arc

The Emotion Curve system plans the emotional journey of a song, assigning specific characteristics to each section:

  • Intro: Anticipation (low tension, building energy)
  • Verse (A): Expectation (moderate tension)
  • Pre-chorus (B): Tension build (high tension, upward pitch tendency)
  • Chorus: Release/resolution (peak energy, maximum density)
  • Bridge: Reflection (lower energy, contrast)
  • Outro: Closure (decreasing tension)

Each section receives emotion parameters (tension, energy, resolution need, pitch tendency, density) that guide generation across all tracks.

Euclidean Rhythms

Mathematical Rhythm Patterns

Drum patterns use Bjorklund's algorithm to distribute hits evenly across steps, creating natural-sounding rhythms found in many musical traditions:

| Pattern | Step pattern | Traditional name or feel |
|---|---|---|
| E(3,8) | [x..x..x.] | Cuban tresillo |
| E(5,8) | [x.xx.xx.] | Cuban cinquillo |
| E(5,16) | - | Bossa nova feel |
| E(4,16) | - | Four-on-the-floor |

Because the hits are spread as evenly as the step grid allows, these patterns tend to feel more natural than probability-based random placement.

Secondary Dominants

Harmonic Enrichment

Secondary dominants (V/V, V/vi, etc.) are automatically inserted to create stronger harmonic pull toward target chords. This enriches chord progressions without requiring manual configuration.
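The underlying arithmetic is simple: the secondary dominant of a target chord is the major triad (optionally with a minor seventh) built a perfect fifth above the target's root. The helper below is a standalone sketch with hypothetical names; it does not model when or where MIDI Sketch decides to insert such chords.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def secondary_dominant(target_root, with_seventh=False):
    """Pitch classes of the dominant chord that resolves to target_root (0 = C)."""
    root = (target_root + 7) % 12                # a perfect fifth above the target
    intervals = (0, 4, 7, 10) if with_seventh else (0, 4, 7)
    return [(root + i) % 12 for i in intervals]


# In C major: V/V targets G (7), V/vi targets A (9).
for label, target in (("V/V", 7), ("V/vi", 9)):
    names = [NOTE_NAMES[pc] for pc in secondary_dominant(target)]
    print(label, names)
```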

Guitar Track

Accompaniment Guitar Generation

A dedicated guitar track generates accompaniment patterns influenced by Blueprint constraints such as guitar skill level and guitar-below-vocal positioning. Guitar appears on its own MIDI channel and can be enabled or disabled independently.

Energy Curve

Song Energy Progression

The Energy Curve system controls how energy progresses through the song, providing high-level control over dynamics beyond per-section settings:

  • GradualBuild: Energy increases steadily from start to finish
  • FrontLoaded: High energy at the start, tapering toward the end
  • WavePattern: Alternating high and low energy across sections
  • SteadyState: Consistent energy level throughout
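One way to picture the four shapes is as a list of per-section energy values. The sketch below uses a 0-100 integer scale with assumed low and high bounds of 20 and 100, and assumes WavePattern starts high; none of these numbers come from MIDI Sketch itself.

```python
def energy_curve(shape, num_sections, low=20, high=100):
    """Return one illustrative energy value (0-100) per section."""
    if num_sections < 1:
        raise ValueError("num_sections must be at least 1")
    span = high - low
    last = max(num_sections - 1, 1)
    indices = range(num_sections)
    if shape == "GradualBuild":
        return [low + span * i // last for i in indices]
    if shape == "FrontLoaded":
        return [high - span * i // last for i in indices]
    if shape == "WavePattern":
        return [high if i % 2 == 0 else low for i in indices]
    if shape == "SteadyState":
        return [(low + high) // 2] * num_sections
    raise ValueError(f"unknown shape: {shape}")


for shape in ("GradualBuild", "FrontLoaded", "WavePattern", "SteadyState"):
    print(shape, energy_curve(shape, 5))
```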

Melody & Motif Overrides

Fine-Grained Parameter Control

Melody Override allows fine-grained control over melody generation parameters:

  • Max leap, syncopation probability, phrase length
  • Long note ratio, chorus register shift
  • Hook repetition, leading tone behavior

Motif Override allows fine-grained control over motif generation parameters:

  • Motif length, note count, motion (0-4)
  • Register (high/mid), rhythm density

Expanded Arpeggio Patterns

8 Arpeggio Patterns

Beyond the basic Up, Down, UpDown, and Random patterns, MIDI Sketch now includes:

  • Pinwheel: Alternating direction pattern
  • PedalRoot: Returns to root between each note
  • Alberti: Classical broken chord pattern (low-high-mid-high)
  • BrokenChord: Irregular chord tone ordering
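The patterns with an unambiguous ordering can be sketched as simple reorderings of a chord's tones. This example covers Up, Down, UpDown, PedalRoot, and Alberti only; Pinwheel, BrokenChord, and Random are left out because their exact ordering is not specified here. Treating the lowest note as the root is an assumption.

```python
def arpeggiate(chord, pattern):
    """Order the tones of `chord` (MIDI pitches) for one pass of the pattern."""
    notes = sorted(chord)
    if len(notes) < 3:
        raise ValueError("this sketch expects at least three chord tones")
    if pattern == "Up":
        return notes
    if pattern == "Down":
        return notes[::-1]
    if pattern == "UpDown":
        return notes + notes[-2:0:-1]            # do not repeat the end points
    if pattern == "PedalRoot":
        root = notes[0]                           # assumed: lowest note is the root
        return [p for upper in notes[1:] for p in (root, upper)]
    if pattern == "Alberti":
        low, mid, high = notes[0], notes[1], notes[-1]
        return [low, high, mid, high]
    raise ValueError(f"pattern not covered by this sketch: {pattern}")


c_major = [60, 64, 67]
for name in ("Up", "Down", "UpDown", "PedalRoot", "Alberti"):
    print(name, arpeggiate(c_major, name))
```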

Performance Controls

DriveFeel, Syncopation & Mora Rhythm
  • DriveFeel: Controls performance intensity from laid-back (0) to aggressive (100), affecting timing tightness and velocity emphasis
  • Syncopation: enableSyncopation toggle adds groove effects by shifting notes off the grid
  • MoraRhythmMode: Support for Japanese mora-timed rhythm, aligning note durations to syllable timing patterns

Piano Roll Safety API

Note Safety Analysis

The Piano Roll Safety API analyzes pitch safety at any point in the generated song. For each MIDI pitch (0-127), it reports:

  • Safety level: Safe (chord tone), Warning (tension/low register/passing tone), or Dissonant (non-scale/collision)
  • Reason flags: Detailed bit flags indicating why a pitch is rated at its level (e.g., ChordTone, Tension, LargeLeap, Minor2nd collision)
  • Collision detection: Identifies which tracks would collide with a given pitch
  • Recommended pitches: Up to 8 suggested pitches for the current harmonic context

Use after any generation call (generateVocal, generateFromConfig, etc.) for real-time pitch guidance in piano roll editors.
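The combination of a safety level with bit-flag reasons can be modeled as below. This is a hypothetical model for illustration only: the flag names loosely follow the examples above, but the bit values, the function name, and the classification rules are assumptions, and low-register and leap checks are omitted.

```python
from enum import IntFlag


class Reason(IntFlag):
    # Bit values are invented for this sketch, not the library's constants.
    CHORD_TONE = 1
    TENSION = 2
    NON_SCALE = 4
    MINOR_2ND_COLLISION = 8


def rate_pitch(pitch, chord_pcs, scale_pcs, sounding):
    """Return (level, reason flags) for one MIDI pitch."""
    flags = Reason(0)
    pc = pitch % 12
    if pc in chord_pcs:
        flags |= Reason.CHORD_TONE
    elif pc in scale_pcs:
        flags |= Reason.TENSION
    else:
        flags |= Reason.NON_SCALE
    # A semitone (or its octave compounds) against any sounding note.
    if any(abs(pitch - other) % 12 == 1 for other in sounding):
        flags |= Reason.MINOR_2ND_COLLISION

    if flags & (Reason.NON_SCALE | Reason.MINOR_2ND_COLLISION):
        level = "Dissonant"
    elif flags & Reason.TENSION:
        level = "Warning"
    else:
        level = "Safe"
    return level, flags


C_CHORD, C_SCALE = {0, 4, 7}, {0, 2, 4, 5, 7, 9, 11}
for pitch in (64, 62, 61):
    print(pitch, rate_pitch(pitch, C_CHORD, C_SCALE, sounding=[60]))
```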

Custom Vocal API

User-Defined Melody Input

The setVocalNotes API allows injecting a custom melody (as an array of note events) instead of using the built-in melody generator. The accompaniment is then generated around the user-provided vocal, with full harmony context coordination including chord recognition, collision avoidance, and Aux track generation.

Chord Timeline API

Harmonic Context Retrieval

The getChordTimeline API returns the chord progression timeline for the generated song, including tick positions, chord degrees, and secondary dominant information. This is used for playback synchronization and harmonic analysis.

SongConfigBuilder

Fluent Configuration API

The SongConfigBuilder provides a fluent API for constructing song configurations with cascade change detection. When a parameter changes, dependent parameters are automatically recalculated, ensuring consistent configurations without manual coordination.

Academic Foundation

The implementation references:

Deterministic Generation

Same seed + same parameters = same output. Every time.

```bash
# These will always produce identical MIDI files
./midisketch_cli --seed 12345 --style jpop
./midisketch_cli --seed 12345 --style jpop
```

Reproducibility Benefits

  • Reproducible results for iterative workflows
  • Share seeds with collaborators
  • Metadata embedded in MIDI files enables regeneration
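The general principle behind seed-based determinism is that every random decision draws from a generator initialized with the seed, never from shared global state. The toy example below shows the idea with Python's standard library; the function and its parameters are invented for illustration and say nothing about MIDI Sketch's own random number generator.

```python
import random


def sketch_melody(seed, length=8, scale=(60, 62, 64, 65, 67, 69, 71, 72)):
    """Pick `length` pitches from `scale` using a generator private to this call."""
    rng = random.Random(seed)        # isolated state: other code cannot disturb it
    return [rng.choice(scale) for _ in range(length)]


first = sketch_melody(12345)
second = sketch_melody(12345)
print(first == second)               # same seed and parameters, same notes
```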

Candidate Selection System

For melody generation, MIDI Sketch doesn't just output the first result. It generates 20-100 candidates per section and selects the best one through evaluation:

  1. Culling: Filter out melodies with issues (high register strain, monotony, scattered notes)
  2. Scoring: Rank survivors on singability, chord tone alignment, contour shape
  3. Selection: Choose the highest-scoring candidate

Candidate Counts by Section

| Section | Candidates |
|---|---|
| Chorus | 100 |
| Pre-chorus (B) | 50 |
| Bridge / Chant | 30 |
| Verse / Intro / Outro | 20 |

Sections where melody quality matters most get more candidates.
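The cull, score, select pipeline can be sketched in a few lines. Only the candidate counts come from the table above; the toy melody generator, the culling rules (range wider than an octave, fewer than three distinct pitches), and the smoothness score are stand-ins invented for the example.

```python
import random

CANDIDATES_PER_SECTION = {
    "Chorus": 100, "Pre-chorus": 50, "Bridge": 30, "Chant": 30,
    "Verse": 20, "Intro": 20, "Outro": 20,
}


def random_melody(rng, length=8, start=64):
    """Toy generator: a random walk in small intervals."""
    pitches = [start]
    for _ in range(length - 1):
        pitches.append(pitches[-1] + rng.choice([-4, -2, -1, 0, 1, 2, 4]))
    return pitches


def passes_culling(melody):
    too_wide = max(melody) - min(melody) > 12     # stand-in for register strain
    monotone = len(set(melody)) < 3               # stand-in for monotony
    return not (too_wide or monotone)


def score(melody):
    """Higher is better: penalize total interval movement (a crude singability proxy)."""
    return -sum(abs(b - a) for a, b in zip(melody, melody[1:]))


def select_melody(section, seed):
    rng = random.Random(seed)
    candidates = [random_melody(rng) for _ in range(CANDIDATES_PER_SECTION[section])]
    survivors = [m for m in candidates if passes_culling(m)] or candidates
    return max(survivors, key=score)


print(select_melody("Chorus", seed=12345))
```

Because the generator is seeded, the selection stays deterministic even though many candidates are explored.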

Style Presets

17 style presets (stylePresetId 0-16) determine the overall musical character, each mapped to one of 24 internal moods. Moods (0-23) can also be set explicitly via moodExplicit, covering:

  • J-Pop / K-Pop / City Pop
  • EDM / Electro Pop / Synthwave / Future Bass
  • Ballad / R&B / R&B Neo Soul / Chill / Lofi
  • Rock / Light Rock
  • Anime / Vocaloid
  • Latin Pop / Trap
  • And more

What Each Preset Configures
  • BPM range
  • Drum patterns
  • Chord voicing style
  • Melody template preferences
  • Evaluation weights
  • Mood-dependent chord extension probabilities

14 vocal style presets are available (Auto, Standard, Vocaloid, UltraVocaloid, Idol, Ballad, Rock, CityPop, Anime, BrightKira, CoolSynth, CuteAffected, PowerfulShout, KPop) to fine-tune melody generation characteristics independently from mood.

Multiple Composition Styles

Three composition paradigms:

| Style | Vocal | Aux | Motif | Arpeggio | Use Case |
|---|---|---|---|---|---|
| MelodyLead (0) | Yes | Yes | Blueprint-dependent | Optional | Songs with vocals |
| BackgroundMotif (1) | No | Yes | Yes | Optional | BGM, lo-fi |
| SynthDriven (2) | No | No | Blueprint-dependent | Optional (manual enable) | Electronic, EDM |

BGM-Only Modes

BackgroundMotif disables Vocal but keeps Aux enabled and forces Motif generation. SynthDriven disables both Vocal and Aux; Arpeggio must be manually enabled with arpeggioEnabled=true.

Vocal-First Workflow

For MelodyLead style, iterate on the melody before generating accompaniment:

Iterate Until Satisfied

Use generateVocal() to create the initial melody, then call regenerateVocal() with a new seed or VocalConfig to try variations. Once satisfied, call generateAccompaniment() to add the backing tracks. Alternatively, use generateWithVocal() for vocal-priority one-shot generation.

Lightweight & Portable

  • ~555KB WASM (gzip: ~225KB) + ~80KB JS
  • No external dependencies (pure C++17)
  • Runs in browser, Node.js, or native CLI
  • No API calls, no internet required

Open Source

License

Apache 2.0 licensed - use generated MIDI commercially, modify and redistribute freely.


Use Cases

Demo Production

Generate quick song sketches to test ideas before investing time in full production.

Learning Tool

Study how chord progressions, voice leading, and arrangement work by examining the output.

DAW Templates

Generate starting points for tracks, then customize with your own sounds and mixing.

Game/Video BGM

Create reproducible background music with deterministic seeds.

Songwriting Aid

Get melody ideas and chord progressions to build upon.


What MIDI Sketch Is Not

Important Distinctions

  • Not an AI audio generator - It outputs MIDI, not audio
  • Not a replacement for composers - It's a tool to generate starting points
  • Not machine learning - It uses explicit music theory rules
  • Not cloud-based - Everything runs locally

Released under the MIT License.