MIDI Sketch is a music theory-based MIDI generator that creates complete pop music arrangements. Unlike AI audio generators like Suno, it outputs editable MIDI data - drums, bass, chords, arpeggios, and melodies - that you can import into your DAW and customize with your own sounds and mixing.

How is MIDI Sketch different from Suno or other AI music generators?

AI audio generators like Suno produce finished audio that you cannot edit note by note. MIDI Sketch generates MIDI data based on music theory, so you can change instruments, edit notes, adjust timing, and mix however you want in your DAW. Generation is also reproducible: the same nonzero seed and settings always produce the same output.

How do I use MIDI Sketch with my DAW?

Generate your song in the browser, then download the MIDI file. Import it directly into any DAW like Ableton Live, FL Studio, Logic Pro, or Cubase. All tracks are separated and ready for your own sounds and mixing.

What music styles does MIDI Sketch support?

MIDI Sketch includes presets for J-Pop, K-Pop, City Pop, EDM, Ballad, Rock, R&B, and more. Each preset configures appropriate rhythms, chord voicings, and arrangement patterns based on music theory for that style.

How are chord progressions generated?

MIDI Sketch auto-generates chord progressions based on popular patterns in each genre. Select a style preset and the system creates appropriate progressions with jazz voicings, tensions, and extensions - all grounded in music theory.

Is MIDI Sketch output reproducible?

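Yes. For any nonzero seed, MIDI Sketch is fully deterministic: the same seed and settings always produce exactly the same MIDI output. A seed of 0 instead requests a clock-based random seed, so it differs on each run. Determinism suits iterative production workflows where you need to return to an earlier result or change one setting at a time.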

How is the melody generated?

MIDI Sketch generates melodies based on music theory - following chord tones, using appropriate scales, and creating natural phrase structures. You can configure vocal ranges and regenerate with different seeds until you find a melody you like.

What tracks are included in the MIDI output?

A complete MIDI Sketch export includes: drums (kick, snare, hi-hat, fills), bass line, chord pads, optional arpeggio, and vocal melody. Each track is on a separate MIDI channel for easy DAW editing.

Can I use MIDI Sketch for commercial music production?

Yes, MIDI Sketch is MIT licensed. All generated MIDI files are yours to use commercially. Use them as starting points for your productions, demos, or final releases.

Why use MIDI instead of AI-generated audio?

MIDI gives you complete creative control. You can change every note, swap instruments, adjust velocities, quantize timing, and mix with your own effects. AI audio is a black box - MIDI is transparent and editable. For serious music production, MIDI is the professional choice.

Architecture Overview

This document explains the internal architecture of MIDI Sketch.

Project Structure

midi-sketch/
├── src/
│   ├── core/              # Core engine (~16000 lines, 46 headers)
│   │   ├── generator.h/cpp        # Central orchestrator
│   │   ├── harmony_context.h      # Inter-track collision detection facade
│   │   ├── chord_progression_tracker.h/cpp
│   │   ├── track_collision_detector.h/cpp
│   │   ├── safe_pitch_resolver.h/cpp
│   │   ├── melody_evaluator.h/cpp # Candidate scoring system
│   │   ├── melody_templates.h/cpp # 7 melody template definitions
│   │   ├── melody_embellishment.h/cpp # NCT insertion system
│   │   ├── pitch_utils.h/cpp      # Pitch operations
│   │   ├── chord_utils.h/cpp      # Chord operations
│   │   ├── piano_roll_safety.h/cpp
│   │   ├── modulation_calculator.h/cpp
│   │   ├── preset_data.h/cpp      # Style presets
│   │   └── ...                    # Types, utilities, etc.
│   ├── track/             # Track generators (~13000 lines, 14 headers)
│   │   ├── melody_designer.h/cpp  # Template-driven melody
│   │   ├── vocal.h/cpp            # Vocal coordination
│   │   ├── aux_track.h/cpp        # Aux sub-melody
│   │   ├── chord_track.h/cpp      # Chord voicing
│   │   ├── bass.h/cpp             # Bass patterns
│   │   ├── drums.h/cpp            # Drum patterns
│   │   ├── motif.h/cpp            # Background motif
│   │   ├── guitar.h/cpp           # Accompaniment guitar
│   │   ├── arpeggio.h/cpp         # Arpeggio patterns
│   │   └── se.h/cpp               # Section markers
│   ├── midi/              # MIDI output (8 headers)
│   ├── analysis/          # Dissonance analysis
│   ├── midisketch.h       # Public C++ API
│   └── midisketch_c.h     # C API (WASM interface)
├── tests/                 # Google Test suite (63 test files)
├── dist/                  # WASM distribution
└── demo/                  # Browser demo

Core Components

MidiSketch Class

The main entry point providing a high-level API:

Two Generation Workflows

Vocal-First: Use generateVocal() → iterate with regenerateVocal() → finalize with generateAccompaniment()
Standard: Use generate() or generateFromConfig() for one-shot generation

Configurations can be constructed using the SongConfigBuilder, a fluent API with cascade change detection that automatically recalculates dependent parameters when upstream values change.

cpp

class MidiSketch {
  void generate(const GeneratorParams& params);
  void generateFromConfig(const SongConfig& config);
  void generateWithVocal(const SongConfig& config);   // Vocal-priority full generation
  void generateVocal(const SongConfig& config);
  void regenerateVocal(const VocalConfig& config);
  void generateAccompaniment(const AccompanimentConfig& config);
  void regenerateAccompaniment(uint32_t seed);
  void setVocalNotes(const SongConfig& config, const NoteInput* notes, size_t count);

  std::vector<uint8_t> getMidi() const;
  std::string getEventsJson() const;
  std::string getChordTimeline() const;               // Chord progression timeline
  const Song& getSong() const;
};

Generator

The central orchestrator (src/core/generator.h) that coordinates all track generation:

cpp

class Generator {
  Song generate(const GeneratorParams& params);
private:
  void buildStructure();
  void generateVocal();
  void generateAux();
  void generateMotif();
  void generateBass();
  void generateChord();
  void generateGuitar();      // Accompaniment guitar generation
  void generateArpeggio();
  void generateDrums();
  void generateSE();          // Section markers / sound effects
  void applyTransitionDynamics();
  void applyHumanization();
};

Song Container

Holds all generated data (9 tracks):

cpp

struct Song {
  Arrangement arrangement;     // Section layout
  MidiTrack vocal;            // Channel 0 - Main melody
  MidiTrack aux;              // Channel 1 - Sub-melody
  MidiTrack chord;            // Channel 2 - Harmony
  MidiTrack bass;             // Channel 3 - Foundation
  MidiTrack motif;            // Channel 4 - BackgroundMotif style
  MidiTrack guitar;           // Channel 6 - Accompaniment guitar
  MidiTrack arpeggio;         // Channel 5 - SynthDriven style
  MidiTrack drums;            // Channel 9 - Rhythm
  MidiTrack se;               // Channel 15 (markers)
};

Channel Sharing

Aux and Arpeggio share MIDI channel 5. In MelodyLead style, Aux is generated; in SynthDriven style, Arpeggio is generated instead. They are never active simultaneously.

Data Flow

Standard Generation (Traditional paradigm)

Generation Order by Paradigm

The track generation order varies depending on the Blueprint paradigm:

Traditional / MelodyDriven: Vocal -> Aux -> Motif -> Bass -> Chord -> Guitar -> Arpeggio -> Drums -> SE
RhythmSync: Motif -> Vocal -> Aux -> Bass -> Chord -> Guitar -> Arpeggio -> Drums -> SE

Vocal-First Generation

generateVocal() -> regenerateVocal() (repeat until satisfied) -> generateAccompaniment()

Time Representation

MIDI Sketch uses tick-based timing throughout:

cpp

using Tick = uint32_t;
constexpr Tick TICKS_PER_BEAT = 480;    // Standard MIDI resolution
constexpr Tick TICKS_PER_BAR = 1920;    // 4/4 time signature
constexpr uint8_t BEATS_PER_BAR = 4;

Tick Calculation

Quarter note = 480 ticks
Eighth note = 240 ticks
Sixteenth note = 120 ticks
One bar (4/4) = 1920 ticks
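
The tick values above follow directly from the two constants. The sketch below restates them as compile-time checks; `noteLength` and `tickAt` are hypothetical helper names used only for illustration, not functions taken from the library.

```cpp
#include <cassert>
#include <cstdint>

using Tick = uint32_t;
constexpr Tick TICKS_PER_BEAT = 480;
constexpr uint8_t BEATS_PER_BAR = 4;
constexpr Tick TICKS_PER_BAR = TICKS_PER_BEAT * BEATS_PER_BAR;  // 1920 in 4/4

// Hypothetical helper: length of a 1/denominator note (4 = quarter, 8 = eighth).
constexpr Tick noteLength(uint32_t denominator) {
  return TICKS_PER_BEAT * 4 / denominator;
}

// Hypothetical helper: absolute tick of a zero-based bar and beat.
inline Tick tickAt(uint32_t bar, uint32_t beat) {
  assert(beat < BEATS_PER_BAR);
  return bar * TICKS_PER_BAR + beat * TICKS_PER_BEAT;
}

static_assert(noteLength(4) == 480, "quarter note");
static_assert(noteLength(8) == 240, "eighth note");
static_assert(noteLength(16) == 120, "sixteenth note");
static_assert(noteLength(1) == TICKS_PER_BAR, "a whole note fills one 4/4 bar");
```

With these definitions, beat 4 of the third bar (bar index 2, beat index 3) starts at tick 2 * 1920 + 3 * 480 = 5280.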

Note Representation

Two-layer note representation:

cpp

// Intermediate musical representation (internal)
struct NoteEvent {
  Tick startTick;      // Absolute start time
  Tick duration;       // Duration in ticks
  uint8_t note;        // MIDI note (0-127)
  uint8_t velocity;    // MIDI velocity (0-127)
};

// Low-level MIDI bytes (output only)
struct MidiEvent {
  Tick tick;           // Absolute time
  uint8_t status;      // MIDI status byte
  uint8_t data1;       // First data byte
  uint8_t data2;       // Second data byte
};
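
To make the relationship between the two layers concrete, here is a minimal sketch of how a list of NoteEvent values could be lowered to MidiEvent bytes. It uses the standard MIDI status bytes (0x9n for Note On, 0x8n for Note Off, where n is the channel). The function name `toMidiEvents` and the ordering rule at equal ticks are assumptions made for this example, not the library's actual writer.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

using Tick = uint32_t;

struct NoteEvent { Tick startTick; Tick duration; uint8_t note; uint8_t velocity; };
struct MidiEvent { Tick tick; uint8_t status; uint8_t data1; uint8_t data2; };

// Hypothetical helper: expand each note into a Note On / Note Off pair on one channel.
std::vector<MidiEvent> toMidiEvents(const std::vector<NoteEvent>& notes, uint8_t channel) {
  assert(channel < 16);
  std::vector<MidiEvent> events;
  events.reserve(notes.size() * 2);
  for (const NoteEvent& n : notes) {
    events.push_back({n.startTick, static_cast<uint8_t>(0x90 | channel), n.note, n.velocity});
    events.push_back({n.startTick + n.duration, static_cast<uint8_t>(0x80 | channel), n.note, 0});
  }
  // Order by time. At equal ticks, Note Off (0x80) sorts before Note On (0x90)
  // so that a repeated pitch is released before it is struck again.
  std::stable_sort(events.begin(), events.end(), [](const MidiEvent& a, const MidiEvent& b) {
    if (a.tick != b.tick) return a.tick < b.tick;
    return (a.status & 0xF0) < (b.status & 0xF0);
  });
  return events;
}
```

Keeping NoteEvent as start plus duration makes musical edits simple, while the paired-event form is only needed when bytes are written out.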

Section Definition

Songs are divided into sections:

cpp

struct Section {
  SectionType type;              // Intro, A, B, Chorus, Bridge, Interlude, Outro
  std::string name;              // Display name
  uint8_t bars;                  // Bar count
  Tick startBar;                 // Start position (bars)
  Tick start_tick;               // Start position (ticks)
  VocalDensity vocal_density;    // Full, Sparse, None
  BackingDensity backing_density; // Normal, Thin, Thick
};
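
The two start fields are redundant by design: start_tick is startBar multiplied by the bar length. A minimal sketch of laying sections end to end, assuming 4/4 throughout, might look like this. `SectionLayout` is a reduced stand-in for Section and `layoutSections` is a hypothetical helper; neither name comes from the library.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using Tick = uint32_t;
constexpr Tick TICKS_PER_BAR = 1920;

// Reduced stand-in for Section: only the fields needed for layout.
struct SectionLayout {
  uint8_t bars;
  Tick startBar;
  Tick start_tick;
};

// Hypothetical helper: place sections back to back and fill in both start fields.
// Returns the total song length in ticks.
Tick layoutSections(std::vector<SectionLayout>& sections) {
  Tick bar = 0;
  for (SectionLayout& s : sections) {
    assert(s.bars > 0);
    s.startBar = bar;
    s.start_tick = bar * TICKS_PER_BAR;
    bar += s.bars;
  }
  return bar * TICKS_PER_BAR;
}
```

For a 4-bar intro followed by two 8-bar sections, the third section starts at bar 12, tick 23040, and the song is 38400 ticks long.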

Composition Styles

Three composition styles affect the generation approach:

Style	Vocal	Aux	Motif	Arpeggio	Description
MelodyLead (0)	Yes	Yes	Blueprint-dependent	Optional	Traditional arrangement with prominent vocal melody
BackgroundMotif (1)	No	Yes	Yes	Optional	Vocal disabled, Aux enabled, Motif as primary focus
SynthDriven (2)	No	No	Blueprint-dependent	Optional (manual enable)	Vocal/Aux disabled, synth/arpeggio-forward electronic style

BGM-Only Modes

BackgroundMotif disables Vocal but keeps Aux enabled and forces Motif generation. SynthDriven disables both Vocal and Aux; Arpeggio must be manually enabled with arpeggioEnabled=true. Use MelodyLead for songs with vocals.
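
The style table can be read as a small lookup from style to the tracks that the style itself decides. The sketch below encodes only what the table states; the `StyleTracks` struct and `tracksFor` function are hypothetical names, and Arpeggio is left out because it follows the arpeggioEnabled setting rather than the style.

```cpp
#include <cassert>
#include <cstdint>

enum class CompositionStyle : uint8_t { MelodyLead = 0, BackgroundMotif = 1, SynthDriven = 2 };

// Hypothetical summary of the table above.
struct StyleTracks {
  bool vocal;
  bool aux;
  bool motifForced;  // false means the blueprint decides
};

inline StyleTracks tracksFor(CompositionStyle style) {
  switch (style) {
    case CompositionStyle::MelodyLead:      return {true,  true,  false};
    case CompositionStyle::BackgroundMotif: return {false, true,  true};
    case CompositionStyle::SynthDriven:     return {false, false, false};
  }
  assert(false && "unknown CompositionStyle");
  return {false, false, false};
}
```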

Production Blueprints

Blueprints are high-level production templates that control track generation order, motif behavior, and implicit overrides. There are 10 blueprints (ID 0-9), plus ID 255 for random selection.

ID	Name	Paradigm	RiffPolicy	Drums Required	Weight
0	Traditional	Traditional	Free	-	42%
1	RhythmLock	RhythmSync	Locked	Yes	14%
2	StoryPop	MelodyDriven	Evolving	-	10%
3	Ballad	MelodyDriven	Free	-	4%
4	IdolStandard	MelodyDriven	Evolving	-	10%
5	IdolHyper	RhythmSync	Locked	Yes	6%
6	IdolKawaii	MelodyDriven	Locked	Yes	5%
7	IdolCoolPop	RhythmSync	Locked	Yes	5%
8	IdolEmo	MelodyDriven	Locked	-	4%
9	BehavioralLoop	Traditional	LockedPitch	-	0%*

* BehavioralLoop (ID 9) has weight 0% and must be explicitly selected (never chosen randomly). It forces addictive_mode=true, RiffPolicy::LockedPitch, and HookIntensity::Maximum.
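
One way to picture random selection with these weights is a weighted draw in which a zero weight can never be picked. The sketch below uses std::discrete_distribution for that purpose. This is an assumption for illustration: the document does not say which sampling method the library uses, and `resolveBlueprint` is a hypothetical name.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <random>

constexpr uint8_t kRandomBlueprint = 255;
// Weights in percent, indexed by blueprint ID, copied from the table above.
constexpr std::array<int, 10> kBlueprintWeights = {42, 14, 10, 4, 10, 6, 5, 5, 4, 0};

// Hypothetical helper: turn a requested ID into a concrete blueprint ID.
uint8_t resolveBlueprint(uint8_t requested, std::mt19937& rng) {
  if (requested != kRandomBlueprint) {
    assert(requested < kBlueprintWeights.size());
    return requested;  // explicit choice, including ID 9
  }
  // A weight of 0 gives probability 0, so ID 9 is never drawn here.
  std::discrete_distribution<int> pick(kBlueprintWeights.begin(), kBlueprintWeights.end());
  return static_cast<uint8_t>(pick(rng));
}
```

Passing 9 returns 9 unchanged, while passing 255 only ever yields IDs 0 through 8.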

Paradigms

Traditional: Vocal -> Aux -> Motif -> Bass -> Chord -> Guitar -> Arpeggio -> Drums -> SE
RhythmSync: Motif -> Vocal -> Aux -> Bass -> Chord -> Guitar -> Arpeggio -> Drums -> SE (Motif as coordinate axis)
MelodyDriven: Vocal -> Aux -> Motif -> Bass -> Chord -> Guitar -> Arpeggio -> Drums -> SE (same order as Traditional but Motif follows melody)
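
The three orders above differ only in where Motif sits. As a rough sketch, the order can be expressed as a lookup keyed by paradigm. The enum and function names here are illustrative assumptions, not the library's own types.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

enum class Paradigm : uint8_t { Traditional, RhythmSync, MelodyDriven };
enum class Track : uint8_t { Vocal, Aux, Motif, Bass, Chord, Guitar, Arpeggio, Drums, SE };

using TrackOrder = std::array<Track, 9>;

// Hypothetical helper: generation order for each paradigm, as listed above.
inline TrackOrder generationOrder(Paradigm paradigm) {
  switch (paradigm) {
    case Paradigm::RhythmSync:  // Motif first, as the coordinate axis
      return {Track::Motif, Track::Vocal, Track::Aux, Track::Bass, Track::Chord,
              Track::Guitar, Track::Arpeggio, Track::Drums, Track::SE};
    case Paradigm::Traditional:
    case Paradigm::MelodyDriven:  // same order; Motif is generated after the melody
      return {Track::Vocal, Track::Aux, Track::Motif, Track::Bass, Track::Chord,
              Track::Guitar, Track::Arpeggio, Track::Drums, Track::SE};
  }
  assert(false && "unknown Paradigm");
  return {};
}
```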

RiffPolicy

The API exposes three RiffPolicy values:

Free (0): Motif varies per section (MotifRepeatScope controls cross-section behavior)
Locked (1): Pitch contour fixed, expression varies (internally LockedContour)
Evolving (2): 30% chance of change every 2 sections

Internally, Blueprints use a finer-grained set: Free(0), LockedContour(1), LockedPitch(2), LockedAll(3), Evolving(4).
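
Because the numeric values of the two sets diverge (Evolving is 2 in the API but 4 internally), a raw cast between them would silently turn Evolving into LockedPitch. A minimal sketch of an explicit mapping follows. The enum and function names are assumptions for illustration; only the values and the Locked-to-LockedContour correspondence come from the text above.

```cpp
#include <cassert>
#include <cstdint>

enum class ApiRiffPolicy : uint8_t { Free = 0, Locked = 1, Evolving = 2 };
enum class InternalRiffPolicy : uint8_t {
  Free = 0, LockedContour = 1, LockedPitch = 2, LockedAll = 3, Evolving = 4
};

// Hypothetical helper: map by name rather than by numeric value.
inline InternalRiffPolicy toInternal(ApiRiffPolicy policy) {
  switch (policy) {
    case ApiRiffPolicy::Free:     return InternalRiffPolicy::Free;
    case ApiRiffPolicy::Locked:   return InternalRiffPolicy::LockedContour;
    case ApiRiffPolicy::Evolving: return InternalRiffPolicy::Evolving;
  }
  assert(false && "unknown ApiRiffPolicy");
  return InternalRiffPolicy::Free;
}
```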

Blueprint Overrides

Blueprints can override several SongConfig parameters:

section_flow overrides formId (when present and formExplicit=false)
riff_policy overrides motifRepeatScope (only when Free)
drums_required forces drums_enabled=true (unless drumsEnabledExplicit=true and drumsEnabled=false)
drums_sync_vocal overrides the SongConfig setting
mood_mask restricts compatible moods (check with isMoodCompatible())
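
The drums rule is the one override with an explicit escape hatch, so it is worth spelling out as a truth function. The sketch below is an interpretation of that single bullet. `DrumSettings` and `resolveDrumsEnabled` are hypothetical names, and only the two SongConfig flags mentioned above are modeled.

```cpp
#include <cassert>

// Hypothetical subset of SongConfig relevant to the drums override.
struct DrumSettings {
  bool drumsEnabled;
  bool drumsEnabledExplicit;  // true when the user set drumsEnabled deliberately
};

// Returns the effective drums setting after the blueprint has been applied.
inline bool resolveDrumsEnabled(const DrumSettings& cfg, bool drumsRequired) {
  const bool userOptedOut = cfg.drumsEnabledExplicit && !cfg.drumsEnabled;
  if (drumsRequired && !userOptedOut) {
    return true;  // blueprint forces drums on
  }
  return cfg.drumsEnabled;  // otherwise the config value stands
}
```

In other words, a blueprint that requires drums turns them on for a default configuration, but an explicit drumsEnabled=false from the user still wins.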

Parameter Application Order

Parameters are applied in a specific cascade order, where later stages can override earlier ones:

StylePreset → VocalStylePreset → MelodicComplexity → SongConfig Overrides → Master Switch

StylePreset: Sets base parameters including melody configuration
VocalStylePreset: Adjusts max_leap, syncopation, density, and other vocal characteristics
MelodicComplexity: Applies density/leap multipliers (Simple reduces, Complex amplifies)
SongConfig Overrides: User-specified melody/motif override parameters take priority over every preset-derived value from the earlier stages
Master Switch: enableSyncopation=false forces syncopation_prob=0.0 and allow_bar_crossing=false
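
The cascade can be sketched as five assignments applied in order, each free to overwrite the result of the previous one. Everything below is illustrative: the struct layouts, the function name `applyCascade`, and the choice of which fields each stage touches are assumptions. Only the stage order and the master-switch behavior are taken from the list above.

```cpp
#include <cassert>
#include <optional>

// Hypothetical subset of melody parameters named in the stages above.
struct MelodyParams {
  int max_leap;
  float density;
  float syncopation_prob;
  bool allow_bar_crossing;
};

struct VocalStyle { int max_leap; float syncopation_prob; };

// Unset fields mean "no user override".
struct UserOverrides {
  std::optional<int> max_leap;
  std::optional<float> syncopation_prob;
};

MelodyParams applyCascade(const MelodyParams& stylePreset, const VocalStyle& vocal,
                          float complexityMultiplier, const UserOverrides& user,
                          bool enableSyncopation) {
  assert(complexityMultiplier > 0.0f);
  MelodyParams p = stylePreset;                    // 1. StylePreset
  p.max_leap = vocal.max_leap;                     // 2. VocalStylePreset
  p.syncopation_prob = vocal.syncopation_prob;
  p.density *= complexityMultiplier;               // 3. MelodicComplexity
  if (user.max_leap) p.max_leap = *user.max_leap;  // 4. SongConfig overrides
  if (user.syncopation_prob) p.syncopation_prob = *user.syncopation_prob;
  if (!enableSyncopation) {                        // 5. Master switch
    p.syncopation_prob = 0.0f;
    p.allow_bar_crossing = false;
  }
  return p;
}
```

Note that the master switch runs last, so a user override of syncopation_prob is still zeroed when enableSyncopation is false.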

Random Number Generation

Deterministic generation using Mersenne Twister:

cpp

std::mt19937 rng(seed);  // Same seed = same output

Reproducibility

seed > 0: Fully deterministic. The same seed with the same parameters always produces identical output.
seed = 0: Random. The current clock time is used as the seed, so output differs on each run.
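
The seed rule can be illustrated with a small sketch that resolves a seed of 0 from the clock and otherwise passes the seed through unchanged. `resolveSeed` and `draw` are hypothetical helpers, and the use of std::chrono::system_clock is an assumption about what "current clock time" means here.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical helper: 0 means "choose a seed from the clock".
uint32_t resolveSeed(uint32_t seed) {
  if (seed != 0) return seed;
  const auto now = std::chrono::system_clock::now().time_since_epoch().count();
  const uint32_t fromClock = static_cast<uint32_t>(now);
  return fromClock != 0 ? fromClock : 1;  // never hand back the sentinel value
}

// Draws `count` raw values so that two runs can be compared.
std::vector<uint32_t> draw(uint32_t seed, std::size_t count) {
  assert(count > 0);
  std::mt19937 rng(resolveSeed(seed));
  std::vector<uint32_t> out(count);
  for (uint32_t& value : out) value = static_cast<uint32_t>(rng());
  return out;
}
```

Two calls to draw with the same nonzero seed return identical sequences, which is the property the generator relies on. Logging the resolved seed when 0 is passed would let a random run be reproduced later.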

WASM Compilation

The library compiles to WebAssembly via Emscripten:

Output: ~555KB WASM (gzip: ~225KB) + ~80KB JS (wrapper + glue)
No external dependencies: Pure C++17
ES6 module: Modular JavaScript wrapper

bash

# Build flags
-sWASM=1 -sMODULARIZE=1 -sEXPORT_ES6=1
-sALLOW_MEMORY_GROWTH=1 -sSTACK_SIZE=1048576

C API Layer

For WASM interop, a C API wraps the C++ classes:

// Lifecycle
MidiSketchHandle handle = midisketch_create();
midisketch_generate(handle, params);
MidiSketchMidiData* midi = midisketch_get_midi(handle);
midisketch_free_midi(midi);
midisketch_destroy(handle);

Key functions:

midisketch_generate() - Core generation
midisketch_generate_vocal_from_json() - Vocal-only generation
midisketch_regenerate_vocal_from_json() - Vocal regeneration
midisketch_generate_accompaniment_from_json() - Accompaniment generation
midisketch_regenerate_accompaniment_from_json() - Accompaniment regeneration
midisketch_generate_with_vocal_from_json() - Vocal-priority full generation
midisketch_set_vocal_notes_from_json() - Custom vocal injection
midisketch_get_piano_roll_safety() - Piano roll safety analysis
midisketch_get_chord_timeline() - Chord timeline retrieval
midisketch_get_midi() - MIDI binary output
midisketch_get_events() - JSON event data
midisketch_get_info() - Metadata (bars, ticks, BPM)
midisketch_blueprint_count() / midisketch_blueprint_name() - Blueprint information
