Track Generators

This document details each track generator in MIDI Sketch.

Track Overview

MIDI Sketch generates 9 tracks across different MIDI channels:

Channel Assignment

| Track | Channel | Program | Role |
| --- | --- | --- | --- |
| Vocal | 0 | Piano (0) | Main melody |
| Aux | 1 | E.Piano (4) | Sub-melody support |
| Chord | 2 | E.Piano (4) | Harmonic backing |
| Bass | 3 | E.Bass (33) | Harmonic foundation |
| Motif | 4 | Synth (81) | BackgroundMotif style |
| Arpeggio | 5 | Synth (81) | SynthDriven style |
| Guitar | 6 | Acoustic Guitar (25) | Accompaniment guitar |
| Drums | 9 | GM Drums | Rhythm |
| SE | 15 | - | Section markers |

Vocal Track

Source: src/track/vocal.cpp (~314 lines), src/track/melody_designer.cpp (~2048 lines)

The vocal system uses a template-driven melody designer with style-aware evaluation for predictable, stylistically accurate melody generation.

Why "Vocal" Track?

The Vocal track generates the main melody line. It's called "Vocal" because it's designed to be sung or played as the lead part. Use MIDI channel 0 (piano) in your DAW to preview it, or assign any instrument you prefer.

Architecture

The vocal generation consists of three major components:

  1. MelodyDesigner (melody_designer.cpp) - Template-driven pitch selection with evaluation
  2. Vocal Generator (vocal.cpp) - Section structure, caching, and coordination
  3. VocalStyleProfile - Unified bias and evaluation configuration per style

Melody Templates

Seven melody templates (IDs 1-7) define melodic characteristics; ID 0 (Auto) selects a template based on the VocalStyle:

| ID | Name | Plateau | Max Step | Use Case |
| --- | --- | --- | --- | --- |
| 0 | Auto | - | - | VocalStyle-based selection |
| 1 | PlateauTalk | 0.65 | 2 | NewJeans, Billie Eilish style |
| 2 | RunUpTarget | 0.20 | 4 | YOASOBI, Ado style |
| 3 | DownResolve | 0.30 | 3 | B-section, pre-chorus |
| 4 | HookRepeat | 0.40 | 3 | TikTok, K-POP hooks |
| 5 | SparseAnchor | 0.50 | 2 | Official髭男dism, ballad |
| 6 | CallResponse | - | - | Duet patterns |
| 7 | JumpAccent | - | - | Emotional peaks |

  • Plateau ratio: Probability of staying on the same pitch (higher = more repetitive)
  • Max step: Maximum interval in semitones (lower = smoother)

Generation Flow

Pitch Selection (4 Choices Only)

The MelodyDesigner limits pitch selection to 4 options:

cpp
enum class PitchChoice {
    Same,       // Stay on current pitch (plateau_ratio)
    StepUp,     // +1 semitone
    StepDown,   // -1 semitone
    TargetStep  // ±2 toward target (if template has target)
};

This constrained approach produces more natural, singable melodies.
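As a minimal sketch of how the four choices could map to a concrete pitch, assuming the semitone offsets given in the enum comments: the `applyChoice` helper and its `target` parameter are hypothetical and are not taken from melody_designer.cpp.

```cpp
enum class PitchChoice { Same, StepUp, StepDown, TargetStep };

// Hypothetical helper: returns the next MIDI pitch for a given choice.
// TargetStep moves 2 semitones toward `target`, or stays put if already there.
int applyChoice(int pitch, PitchChoice choice, int target) {
    switch (choice) {
        case PitchChoice::Same:     return pitch;
        case PitchChoice::StepUp:   return pitch + 1;
        case PitchChoice::StepDown: return pitch - 1;
        case PitchChoice::TargetStep:
            if (target > pitch) return pitch + 2;
            if (target < pitch) return pitch - 2;
            return pitch;
    }
    return pitch;  // not reached for valid enum values
}
```

Because every move is at most two semitones, a melody built from these choices cannot leap, whichever template drives the selection.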

Vocal Attitudes

| Attitude | Description | Implementation |
| --- | --- | --- |
| Clean | Conservative, singable | Chord tones only, on-beat |
| Expressive | Emotional, dynamic | Tensions allowed, timing variance |
| Raw | Edgy, unconventional | Non-chord tones, boundary breaking |

Phrase Caching

Phrases are cached using a composite key (V2 cache) to ensure musical coherence:

cpp
struct PhraseCacheKey {
    SectionType type;      // Verse, Chorus, etc.
    uint8_t bars;          // Section length
    uint8_t chord_degree;  // Starting chord degree
};

// Cache behavior:
// - 80% exact reuse: Same phrase reproduced
// - 20% variation: Applied transformations (octave shift, rhythm variation)
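The composite key can be used directly as a map key once it has an ordering. The sketch below assumes a `std::map`-based cache, a `Phrase` type that holds only pitches, and a `shouldVary` helper that turns a random roll into the 80/20 decision; none of these names are confirmed by vocal.cpp.

```cpp
#include <cstdint>
#include <map>
#include <tuple>
#include <vector>

enum class SectionType : uint8_t { Verse, Chorus, Bridge };  // assumed subset

struct PhraseCacheKey {
    SectionType type;
    uint8_t bars;
    uint8_t chord_degree;
    // Strict weak ordering so the key can be used in std::map.
    bool operator<(const PhraseCacheKey& o) const {
        return std::tie(type, bars, chord_degree) <
               std::tie(o.type, o.bars, o.chord_degree);
    }
};

using Phrase = std::vector<int>;  // hypothetical: MIDI pitches only

// Returns the cached phrase for `key`, or nullptr on a cache miss.
const Phrase* findPhrase(const std::map<PhraseCacheKey, Phrase>& cache,
                         const PhraseCacheKey& key) {
    auto it = cache.find(key);
    return it == cache.end() ? nullptr : &it->second;
}

// Maps a roll in [0, 1) to the reuse policy: below 0.8 reuse, otherwise vary.
bool shouldVary(double roll) { return roll >= 0.8; }
```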

Phrase Variation

When reusing cached phrases, the system may apply variations:

  • Octave shift: Move phrase up/down an octave
  • Rhythm variation: Slight timing adjustments
  • Contour inversion: Flip ascending/descending patterns
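Two of these variations are simple pitch transforms. The following sketch assumes a phrase is represented as a list of MIDI pitches and that inversion mirrors the contour around the first note; both function names are hypothetical.

```cpp
#include <vector>

// Shifts every pitch by whole octaves (12 semitones each).
std::vector<int> octaveShift(const std::vector<int>& pitches, int octaves) {
    std::vector<int> out;
    out.reserve(pitches.size());
    for (int p : pitches) out.push_back(p + 12 * octaves);
    return out;
}

// Mirrors the contour around the first pitch: rising moves become falling.
std::vector<int> invertContour(const std::vector<int>& pitches) {
    std::vector<int> out;
    out.reserve(pitches.size());
    for (int p : pitches) out.push_back(2 * pitches.front() - p);
    return out;
}
```

A real implementation would also need to re-check the result against the vocal range and the current chord, which this sketch leaves out.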

Range Constraints

cpp
struct VocalRange {
    uint8_t low = 60;   // C4
    uint8_t high = 79;  // G5
};
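One possible way to enforce these limits is to fold out-of-range pitches back by octaves, which preserves the pitch class. This is an illustrative assumption; the generator may clamp or reject such notes instead, and `foldIntoRange` is a hypothetical name.

```cpp
#include <cstdint>

struct VocalRange {
    uint8_t low = 60;   // C4
    uint8_t high = 79;  // G5
};

// Moves `pitch` by octaves until it lies inside [low, high].
// Assumes the range spans at least 11 semitones (true for the defaults),
// so one pass through each loop always ends inside the range.
int foldIntoRange(int pitch, const VocalRange& range) {
    while (pitch < range.low)  pitch += 12;
    while (pitch > range.high) pitch -= 12;
    return pitch;
}
```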

Non-Chord Tone Decoration

The vocal track uses non-chord tones (NCT) to add melodic interest beyond simple chord-tone melodies:

Strong Beats and Weak Beats

In 4/4 time, strong beats are beats 1 and 3 (where you naturally tap your foot), while weak beats are beats 2 and 4. Chord tones on strong beats create stability; non-chord tones on weak beats add movement without disrupting the harmony.

| NCT Type | Description | Placement |
| --- | --- | --- |
| ChordTone | Notes belonging to the current chord (baseline) | Strong beats |
| PassingTone | Stepwise connection between two chord tones | Weak beats |
| NeighborTone | Step away from a chord tone and return | Weak beats |
| Appoggiatura | Accented dissonance that resolves by step | Strong beats |
| Anticipation | Early arrival of the next chord's tone | Before chord change |
| Tension | Extended chord tones (9th, 11th, 13th) | Based on style |

Configuration varies by mood:

  • Bright: More chord tones, less dissonance
  • Jazzy: More tensions, syncopation
  • Ballad: Balanced with expressive appoggiaturas
  • J-POP: Prefers pentatonic scale (yonanuki) intervals

VocalStyleProfile

Each vocal style has a unified profile that controls both generation bias and evaluation weights:

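| Profile | Plateau Bias | High Register | Singability | Surprise |
| --- | --- | --- | --- | --- |
| Standard | 1.0 | 1.0 | 0.25 | 0.15 |
| Idol | 1.2 | 1.0 | 0.30 | 0.12 |
| Rock | 0.8 | 1.2 | 0.20 | 0.20 |
| Ballad | 1.1 | 0.9 | 0.40 | 0.10 |
| Anime | 0.9 | 1.3 | 0.25 | 0.25 |
| Vocaloid | 0.6 | 1.1 | 0.10 | 0.25 |
| KPop (13) | 1.0 | 1.2 | 0.25 | 0.20 |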

UltraVocaloid Mode

Enhanced Vocaloid-style generation with:

  • Machine-gun rhythm: Rapid-fire 16th note sequences characteristic of Vocaloid songs
  • Breathing points: Automatic insertion of micro-pauses for phrasing even in dense passages
  • Per-section rhythm lock: Each section maintains consistent rhythmic identity

Profile Parameters

  • Plateau Bias: Preference for staying on the same pitch (higher = more repetitive)
  • High Register: Preference for high notes (higher = brighter)
  • Singability: Weight for human-singable melodies (higher = easier to sing)
  • Surprise: Weight for unexpected melodic turns (higher = more dynamic)

Melody Evaluation System

The MelodyDesigner generates multiple candidate melodies and evaluates them:

Evaluation Components:

| Component | Weight | Criteria |
| --- | --- | --- |
| Style Score | 40% | Contour matching, pattern consistency, surprise balance |
| Singability Score | 40% | Stepwise motion, breath marks, monotony avoidance |
| Bias Score | 20% | Interval distribution matching style preferences |
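The weights combine into a single score per candidate. The sketch below assumes each component is normalized to the range 0 to 1 and that the highest total wins; `CandidateScores`, `totalScore`, and `pickBest` are hypothetical names.

```cpp
#include <cstddef>
#include <vector>

struct CandidateScores {
    double style;        // 0..1
    double singability;  // 0..1
    double bias;         // 0..1
};

// Weighted sum using the 40/40/20 split from the table.
double totalScore(const CandidateScores& s) {
    return 0.4 * s.style + 0.4 * s.singability + 0.2 * s.bias;
}

// Returns the index of the highest-scoring candidate (0 if the list is empty).
std::size_t pickBest(const std::vector<CandidateScores>& candidates) {
    std::size_t best = 0;
    for (std::size_t i = 1; i < candidates.size(); ++i) {
        if (totalScore(candidates[i]) > totalScore(candidates[best])) best = i;
    }
    return best;
}
```

With these weights, a candidate that is stylistically strong but hard to sing (0.9, 0.2, 0.5) scores 0.54 and loses to a more balanced one (0.6, 0.8, 0.5) at 0.66.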

Hook System

Chorus sections use a dedicated hook generation system with 6 rhythm patterns:

| Pattern | Rhythm | Character |
| --- | --- | --- |
| Buildup | 8-8-4 | Classic step resolution |
| Syncopated | 4-8-8 | Syncopated start |
| FourNote | 8-8-8-4 | High energy |
| Powerful | 4-4 | Simple, strong |
| Dotted | 8-4-8 | Dotted rhythm feel |
| CallResponse | 4-8-8-8 | Call and response |
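If the rhythm column is read as note values (4 = quarter note, 8 = eighth note), each pattern converts to tick durations as sketched below. That reading is an assumption, as are the `rhythmToTicks` helper and the 480 ticks per beat, which is inferred from the tick values in the arpeggio speed comments.

```cpp
#include <sstream>
#include <string>
#include <vector>

constexpr int TICKS_PER_BEAT = 480;  // assumed; matches the arpeggio comments

// Converts a pattern such as "8-8-4" into tick durations, reading each
// number as a note value (a whole note is 4 beats).
std::vector<int> rhythmToTicks(const std::string& pattern) {
    std::vector<int> ticks;
    std::istringstream in(pattern);
    std::string token;
    while (std::getline(in, token, '-')) {
        const int noteValue = std::stoi(token);
        ticks.push_back(TICKS_PER_BEAT * 4 / noteValue);
    }
    return ticks;
}
```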

Hook Skeletons:

| Skeleton | Description |
| --- | --- |
| Repeat | Same pitch repeated |
| Ascending | Rising contour |
| AscendDrop | Rise then fall |
| LeapReturn | Jump and return |
| RhythmRepeat | Pitch varies, rhythm constant |

Hook Intensity controls hook prominence:

  • Off (0): No hook repetition
  • Light (1): Subtle hook presence
  • Normal (2): Standard pop hooks
  • Strong (3): Heavy hook emphasis (TikTok-style)

Global Motif System

The vocal track extracts a global motif from the first generated phrase and uses it to maintain musical coherence:

cpp
struct GlobalMotif {
    vector<int8_t> interval_signature;  // Relative pitch changes (max 8)
    vector<float> rhythm_signature;     // Relative duration ratios
    ContourType contour_type;           // Ascending, Descending, Peak, Valley, Plateau
};

Evaluation Bonus:

  • Matching contour type: +5% score
  • Similar interval patterns: +5% score (3+ matches)

These bonuses help later sections feel related to the opening.
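The interval signature and the bonus can be sketched as follows. This assumes phrases are lists of MIDI pitches, that intervals are compared position by position, and that the two bonuses add up; the function names are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Relative pitch changes between consecutive notes, capped at 8 entries.
std::vector<int8_t> intervalSignature(const std::vector<int>& pitches) {
    std::vector<int8_t> sig;
    for (std::size_t i = 1; i < pitches.size() && sig.size() < 8; ++i) {
        sig.push_back(static_cast<int8_t>(pitches[i] - pitches[i - 1]));
    }
    return sig;
}

// Counts positions where two signatures hold the same interval.
int matchingIntervals(const std::vector<int8_t>& a, const std::vector<int8_t>& b) {
    int matches = 0;
    const std::size_t n = std::min(a.size(), b.size());
    for (std::size_t i = 0; i < n; ++i) {
        if (a[i] == b[i]) ++matches;
    }
    return matches;
}

// Bonus as a fraction of the score: 0.05 per satisfied criterion.
double motifBonus(bool sameContour, int matches) {
    return (sameContour ? 0.05 : 0.0) + (matches >= 3 ? 0.05 : 0.0);
}
```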

Piano Roll Safety API

Source: src/core/piano_roll_safety.cpp

The Piano Roll Safety API helps external tools (like piano roll editors) determine safe pitch placements:

cpp
enum class CollisionType : uint8_t {
    None,    // No collision - safe to place
    Mild,    // Tritone (context-dependent)
    Severe   // Minor 2nd / Major 7th (always dissonant)
};

Collision Detection:

| Interval | Type | Risk |
| --- | --- | --- |
| Minor 2nd (1 semitone) | Severe | Always avoid |
| Major 7th (11 semitones) | Severe | Always avoid |
| Tritone (6 semitones) | Mild | Context-dependent |
| Others | None | Generally safe |
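The table translates into a small classifier. The sketch below reduces the interval to a single octave, so a minor 9th is treated like a minor 2nd; that reduction and the `classifyCollision` name are assumptions rather than the documented behavior of piano_roll_safety.cpp.

```cpp
#include <cstdint>
#include <cstdlib>

enum class CollisionType : uint8_t { None, Mild, Severe };

// Classifies two MIDI pitches by their interval reduced to one octave.
CollisionType classifyCollision(int pitchA, int pitchB) {
    const int interval = std::abs(pitchA - pitchB) % 12;
    if (interval == 1 || interval == 11) return CollisionType::Severe;  // m2 / M7
    if (interval == 6) return CollisionType::Mild;                      // tritone
    return CollisionType::None;
}
```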

Modulation Awareness

The API accounts for key modulation. When modulation is enabled, the effective_vocal_high is reduced to prevent the final chorus from exceeding the vocal range after transposition.


Aux Track

Source: src/track/aux_track.cpp (~1170 lines)

The Aux (auxiliary) track provides sub-melody support for the main vocal. It's not a counter-melody, but a "perceptual control layer" that enhances the main melody.

Purpose

| Role | Description |
| --- | --- |
| Addictiveness | Pulse loops create repetitive, catchy patterns |
| Physicality | Groove accents add body movement feel |
| Stability | Phrase tails provide resolution |
| Structure | Helps listeners perceive section boundaries |

Aux Functions

9 auxiliary functions are available:

| ID | Function | Description |
| --- | --- | --- |
| 0 | PulseLoop | Repetitive same-pitch or fixed-interval patterns |
| 1 | TargetHint | Hints at vocal target with chord tones |
| 2 | GrooveAccent | Rhythmic accents with staccato |
| 3 | PhraseTail | End-of-phrase descending resolution |
| 4 | EmotionalPad | Long sustained chord tones |
| 5 | Unison | Vocal unison doubling |
| 6 | MelodicHook | Melodic hook riff |
| 7 | MotifCounter | Counter melody (contrary motion) |
| 8 | SustainPad | Whole-note chord tone pad |

Template → Aux Mapping

Each melody template automatically selects appropriate aux functions:

| Template | Aux Functions | Reason |
| --- | --- | --- |
| PlateauTalk | A (PulseLoop) | Ice Cream / minimal style |
| RunUpTarget | B + D | YOASOBI ascending then resolving |
| HookRepeat | A + C | TikTok repetitive hooks |
| SparseAnchor | E + D | Ballad emotional support |

Generation Constraints

  • Always generated after vocal (to avoid collisions)
  • Narrower range than vocal (50-70% of vocal range)
  • Lower velocity (0.5-0.8× vocal velocity)
  • Uses HarmonyContext to avoid dissonance with vocal

Chorus Behavior

In chorus sections, Aux track adapts its behavior:

  • Reduced density: Aux takes a backseat to let vocal shine
  • Lower register: Moves to lower range to avoid vocal collision
  • Simplified patterns: Uses more sustained notes, less busy patterns
  • Phrase endings: Respects phrase boundaries with proper resolution

Chord Track

Source: src/track/chord_track.cpp (~2000 lines)

Generates harmonic voicings with voice leading optimization.

Voicing Types

Voice Leading Algorithm

cpp
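#include <algorithm>
#include <cstdlib>
#include <vector>

// Sum of absolute semitone movement across the four voices
int voiceLeadingDistance(const Voicing& prev, const Voicing& next) {
    int distance = 0;
    for (int i = 0; i < 4; i++) {
        distance += std::abs(prev.notes[i] - next.notes[i]);
    }
    return distance;
}

// Select the voicing that minimizes distance (candidates must be non-empty)
Voicing selectBestVoicing(const Voicing& prev, const std::vector<Voicing>& candidates) {
    return *std::min_element(candidates.begin(), candidates.end(),
        [&](const Voicing& a, const Voicing& b) {
            return voiceLeadingDistance(prev, a) < voiceLeadingDistance(prev, b);
        });
}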

Bass Coordination

Uses BassAnalysis to avoid doubling:

cpp
if (bassAnalysis.hasRootOnBeat1) {
    // Use rootless voicing - bass provides root
    voicing = generateRootlessVoicing(chord);
} else {
    // Include root in chord voicing
    voicing = generateFullVoicing(chord);
}

Register Constraints

cpp
constexpr uint8_t CHORD_LOW = 48;   // C3
constexpr uint8_t CHORD_HIGH = 84;  // C6

Guitar Track

Source: src/track/guitar.cpp

The Guitar track generates accompaniment guitar patterns on a dedicated MIDI channel (Ch 6). It provides rhythmic and harmonic support that complements the chord track.

Parameters

| Parameter | Default (JS) | Default (C++) | Description |
| --- | --- | --- | --- |
| guitarEnabled | false | true | Enable/disable guitar track generation |

Blueprint Constraints

Guitar generation is influenced by Blueprint constraints:

| Constraint | Description |
| --- | --- |
| guitar_skill | Skill level (Beginner/Intermediate/Advanced/Virtuoso) affecting pattern complexity and voicing sophistication |
| guitar_below_vocal | When enabled, keeps guitar voicings below the vocal register (vocal_low - 2 semitones) to avoid masking the melody |
| guitar_style_hint | Per-section style hint (0-7) defined in the Blueprint's SectionSlot. 0 = auto-select based on mood and energy |

Generation

  • Guitar is generated after the chord track, allowing it to complement existing harmonic voicing
  • Patterns adapt to section energy and mood
  • Per-section guitar_style_hint (0-7) in the Blueprint's SectionSlot can influence the style of guitar accompaniment
  • Guitar appears on MIDI channel 6 with Acoustic Guitar (program 25) by default

Bass Track

Source: src/track/bass.cpp (~1170 lines)

Generates the harmonic foundation with root-focused patterns.

Pattern Types

The bass system supports 17+ BassPattern types. The active pattern is selected automatically based on mood and section, or influenced per-section via bass_style_hint in the Blueprint's SectionSlot (0=auto, 1-17 maps to BassPattern+1). Common pattern categories:

| Pattern | Description | Rhythm |
| --- | --- | --- |
| Sparse | Minimal, ballad-style | Beat 1 only |
| Standard | Pop/rock baseline | Beats 1, 3 with fills |
| Driving | Energetic, forward | Eighth notes throughout |

Generation Logic

Approach Notes

Beat 4 may use a chromatic approach to the next root:

cpp
// If next chord root is C
// Beat 4 could be B (half step below) or Db (half step above)
uint8_t approachNote = nextRoot - 1; // chromatic approach from a half step below
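A slightly fuller sketch covers both directions and guards the ends of the MIDI range. The `approachNote` function and its `fromBelow` flag are hypothetical; how bass.cpp picks the direction is not documented here.

```cpp
#include <cstdint>

// Returns a chromatic approach note one semitone below or above the next root.
// Falls back to the root itself when the neighbor would leave the 0-127 range.
uint8_t approachNote(uint8_t nextRoot, bool fromBelow) {
    if (fromBelow) {
        return nextRoot > 0 ? static_cast<uint8_t>(nextRoot - 1) : nextRoot;
    }
    return nextRoot < 127 ? static_cast<uint8_t>(nextRoot + 1) : nextRoot;
}
```

For a next root of C2 (MIDI 36), this yields B1 (35) from below or Db2 (37) from above.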

Drums Track

Source: src/track/drums.cpp (~880 lines)

Generates drum patterns with fills and dynamics.

GM Drum Map

cpp
constexpr uint8_t KICK = 36;
constexpr uint8_t SNARE = 38;
constexpr uint8_t SIDE_STICK = 37;
constexpr uint8_t CLOSED_HH = 42;
constexpr uint8_t OPEN_HH = 46;
constexpr uint8_t RIDE = 51;
constexpr uint8_t CRASH = 49;
constexpr uint8_t TOM_HIGH = 50;
constexpr uint8_t TOM_MID = 47;
constexpr uint8_t TOM_LOW = 45;

Pattern Styles

Fill Types

cpp
enum class FillType {
    TomDescend,    // High → Mid → Low tom
    TomAscend,     // Low → Mid → High tom
    SnareRoll,     // Rapid snare hits
    Combo          // Mixed elements
};

Fills are inserted at:

  • Section transitions
  • Every 4 or 8 bars
  • Before chorus

Euclidean Drums

Blueprints can specify euclidean_drums_percent to control the probability of using Euclidean rhythm patterns, which distribute hits as evenly as possible across a given number of steps.
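To illustrate what "as evenly as possible" means, the sketch below places hits with a Bresenham-style spacing rule. It is not necessarily the algorithm used in drums.cpp, and `euclideanPattern` is a hypothetical name.

```cpp
#include <vector>

// Spreads `hits` onsets over `steps` slots as evenly as possible.
// Step i is a hit when the running product i * hits wraps past a multiple
// of `steps`, which is the same idea as drawing a line on a pixel grid.
std::vector<bool> euclideanPattern(int hits, int steps) {
    std::vector<bool> pattern;
    if (steps <= 0) return pattern;
    if (hits < 0) hits = 0;
    if (hits > steps) hits = steps;
    pattern.reserve(steps);
    for (int i = 0; i < steps; ++i) {
        pattern.push_back((i * hits) % steps < hits);
    }
    return pattern;
}
```

For 3 hits over 8 steps this produces `x..x..x.`, the 3+3+2 grouping often used for kick patterns.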

Drum Role

Per-section drum_role in the Blueprint's SectionSlot controls drum behavior:

| Role | Description |
| --- | --- |
| Full | Standard full drum kit |
| Ambient | Subdued, atmospheric |
| Minimal | Sparse, minimal patterns |
| FXOnly | Sound effects only, no standard kit |

Ghost Notes

Velocity-reduced snare articulations for groove:

cpp
// Main snare: velocity 100
// Ghost note: velocity 40-60

Ghost note density adapts to mood:

  • Energetic moods (BrightUpbeat, IdolPop): Higher density for livelier feel
  • Calm moods (Ballad, Chill): Sparse ghost notes

Swing Timing

Continuous swing control varies by section type and progress:

cpp
float calculateSwingAmount(SectionType section, int bar_in_section, int total_bars);
// Returns 0.0 (straight) to 0.7 (heavy swing)

| Section | Base Swing | Behavior |
| --- | --- | --- |
| Verse | Low | Builds gradually |
| Chorus | Medium | Consistent groove |
| Bridge | Variable | Context-dependent |

Swing is applied to off-beat notes (8th and 16th subdivisions).

Triplet Grids

Drum patterns support triplet subdivisions for shuffle and swing feels:

  • Straight: Standard 8th/16th note grid
  • Triplet: 12/24 subdivisions per beat
  • Hybrid: Mix of straight and triplet patterns

Humanization

Subtle timing and velocity variations make patterns feel less mechanical:

  • Timing jitter: ±5-15 ticks from grid
  • Velocity variation: ±5-10 from base velocity
  • Hi-hat accent patterns: Natural emphasis on downbeats
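The timing and velocity variations can be sketched with two uniform distributions. The `Note` struct and `humanize` function are hypothetical, and the default bounds simply take the upper ends of the ranges listed above; the actual distribution used in drums.cpp may differ.

```cpp
#include <algorithm>
#include <cstdint>
#include <random>

struct Note {
    int tick;
    uint8_t velocity;
};

// Applies random timing and velocity offsets within the given bounds.
// Ticks never go below 0 and velocity stays in the valid 1-127 range.
Note humanize(Note note, std::mt19937& rng,
              int maxTickJitter = 15, int maxVelJitter = 10) {
    std::uniform_int_distribution<int> tickDist(-maxTickJitter, maxTickJitter);
    std::uniform_int_distribution<int> velDist(-maxVelJitter, maxVelJitter);
    note.tick = std::max(0, note.tick + tickDist(rng));
    const int velocity = note.velocity + velDist(rng);
    note.velocity = static_cast<uint8_t>(std::max(1, std::min(127, velocity)));
    return note;
}
```

Passing the generator by reference keeps the output reproducible for a given seed, in line with how `std::mt19937& rng` is passed to the track generators.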

Vocal Synchronization

When drums_sync_vocal is enabled, kick drums align with vocal onset positions:

cpp
void generateDrumsTrackWithVocal(
    MidiTrack& track,
    const Song& song,
    const GeneratorParams& params,
    std::mt19937& rng,
    const VocalAnalysis& vocal_analysis  // Pre-analyzed vocal data
);

This "rhythm lock" effect makes the groove follow the melody, common in modern pop production.


Motif Track

Source: src/track/motif.cpp (~630 lines)

Used in the BackgroundMotif composition style (BGM-only mode). It creates repeating patterns that serve as the primary melodic element, allowing the vocal to take a background role or be omitted entirely.

Parameters

cpp
struct MotifParams {
    MotifLength length;           // 0=auto(2 bars), 1, 2, or 4 beats
    RhythmDensity rhythm_density; // 0=Sparse, 1=Medium, 2=Driving
    MotifMotion motion;           // 0=Stepwise, 1=GentleLeap, 2=WideLeap, 3=NarrowStep, 4=Disjunct
    RepeatScope repeat_scope;     // FullSong, PerSection
    MotifRegister register_;      // 0=auto(mid), 1=low, 2=high
    uint8_t note_count;           // 0=auto(6), 3-8
};

Override Parameters

When motif overrides are specified in the config, the following parameters take precedence over style defaults:

| Parameter | Type | Description |
| --- | --- | --- |
| motifLength | int (0=auto, 1/2/4) | Override motif length in beats (0 defaults to 2 bars) |
| motifNoteCount | int (0=auto, 3-8) | Override number of notes in the motif (0 defaults to 6) |
| motifMotion | int (0xFF=preset, 0-4) | Override motion type (0=Stepwise, 1=GentleLeap, 2=WideLeap, 3=NarrowStep, 4=Disjunct; internal 5=Ostinato for Blueprints only) |
| motifRegisterHigh | int (0=auto, 1=low, 2=high) | Override register range |
| motifRhythmDensity | int (0xFF=preset, 0-2) | Override rhythm density (0=Sparse, 1=Medium, 2=Driving) |
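The two sentinel conventions in this table (0xFF for "use the preset", 0 for "auto") can be resolved as sketched below. The helper names are hypothetical, and clamping out-of-range note counts to 3-8 is an assumption about how invalid input is handled.

```cpp
#include <cstdint>

constexpr uint8_t kUsePreset = 0xFF;  // sentinel for motifMotion / motifRhythmDensity

// Returns the override when set, otherwise the style preset.
uint8_t resolveOverride(uint8_t overrideValue, uint8_t presetValue) {
    return overrideValue == kUsePreset ? presetValue : overrideValue;
}

// Note count uses 0 as "auto" (6 notes) and otherwise stays within 3-8.
uint8_t resolveNoteCount(uint8_t requested) {
    if (requested == 0) return 6;
    if (requested < 3) return 3;
    if (requested > 8) return 8;
    return requested;
}
```

A separate sentinel is needed for motion and density because 0 is a valid value there (Stepwise and Sparse).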

Pattern Generation

MotifMotion values (API: 0-4, internal: 0-5):

| Value | Name | Description |
| --- | --- | --- |
| 0 | Stepwise | Scale steps only (2nds) |
| 1 | GentleLeap | Up to 3rds |
| 2 | WideLeap | Up to 5ths |
| 3 | NarrowStep | Narrow scale degrees (jazzy) |
| 4 | Disjunct | Irregular leaps (experimental) |
| 5 | Ostinato | Same pitch class repeated (internal Blueprint use only) |

Register Ranges

| Register | Range |
| --- | --- |
| Mid | C3 (48) - C5 (72) |
| High | C4 (60) - C6 (84) |

Arpeggio Track

Source: src/track/arpeggio.cpp (~275 lines)

Used in the SynthDriven composition style (BGM-only mode). It creates arpeggiated patterns that serve as the primary harmonic/melodic element in electronic-style tracks.

Parameters

cpp
struct ArpeggioParams {
    ArpeggioPattern pattern;  // Up, Down, UpDown, Random, Pinwheel, PedalRoot, Alberti, BrokenChord
    ArpeggioSpeed speed;      // Eighth, Sixteenth, Triplet
    uint8_t octave_range;     // 1-3 octaves
    float gate;               // Note length ratio (0.0-1.0)
    bool sync_chord;          // Follow chord changes
};

Pattern Types (8 Total)

| ID | Pattern | Description |
| --- | --- | --- |
| 0 | Up | Ascending through chord tones |
| 1 | Down | Descending through chord tones |
| 2 | UpDown | Ascending then descending |
| 3 | Random | Random chord tone selection |
| 4 | Pinwheel | Alternating direction pattern |
| 5 | PedalRoot | Returns to root between each note |
| 6 | Alberti | Classical broken chord (low-high-mid-high) |
| 7 | BrokenChord | Irregular chord tone ordering |
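For the deterministic patterns, one cycle can be expressed as a list of indices into the chord tones sorted from low to high. The sketch below covers four of the eight patterns; the `arpeggioOrder` function, the local enum subset, and the choice not to repeat the top and bottom notes in UpDown are assumptions.

```cpp
#include <vector>

enum class ArpeggioPattern { Up, Down, UpDown, PedalRoot };  // subset for illustration

// Returns indices into a chord-tone list of size n (sorted low to high)
// for one cycle of the given pattern.
std::vector<int> arpeggioOrder(ArpeggioPattern pattern, int n) {
    std::vector<int> order;
    if (n <= 0) return order;
    switch (pattern) {
        case ArpeggioPattern::Up:
            for (int i = 0; i < n; ++i) order.push_back(i);
            break;
        case ArpeggioPattern::Down:
            for (int i = n - 1; i >= 0; --i) order.push_back(i);
            break;
        case ArpeggioPattern::UpDown:
            for (int i = 0; i < n; ++i) order.push_back(i);
            for (int i = n - 2; i >= 1; --i) order.push_back(i);  // skip endpoints
            break;
        case ArpeggioPattern::PedalRoot:
            if (n == 1) order.push_back(0);
            for (int i = 1; i < n; ++i) {
                order.push_back(0);  // root before every other tone
                order.push_back(i);
            }
            break;
    }
    return order;
}
```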

Speed Conversion

cpp
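Tick getNoteDuration(ArpeggioSpeed speed) {
    switch (speed) {
        case ArpeggioSpeed::Eighth:    return TICKS_PER_BEAT / 2;    // 240
        case ArpeggioSpeed::Sixteenth: return TICKS_PER_BEAT / 4;    // 120
        case ArpeggioSpeed::Triplet:   return TICKS_PER_BEAT / 3;    // 160
    }
    return TICKS_PER_BEAT / 2;  // fallback for out-of-range values
}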

SE Track

Source: src/track/se.cpp (~15 lines)

Minimal track for section markers (text events only).

cpp
void generateSE(Song& song) {
    for (auto& section : song.arrangement.sections) {
        MidiEvent marker;
        marker.tick = section.start_tick;
        marker.type = MidiEventType::Text;
        marker.text = section.name;
        song.se.addEvent(marker);
    }
}

Velocity Calculation

Common velocity formula across tracks:

cpp
uint8_t calculateVelocity(
    uint8_t baseVelocity,
    int beat,
    SectionType section,
    float trackBalance
) {
    float beatAdjust = getBeatAccent(beat);      // Strong beats: +10
    float sectionMult = getSectionEnergy(section); // Chorus: 1.2

    // Compute in float, clamp to the valid MIDI velocity range, then narrow
    float velocity = baseVelocity * beatAdjust * sectionMult * trackBalance;
    return static_cast<uint8_t>(std::clamp(velocity, 1.0f, 127.0f));
}

Track Balance

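| Track | Balance | Notes |
| --- | --- | --- |
| Vocal | 1.00 | Lead instrument |
| Aux | 0.50-0.80 | Sub-melody support |
| Chord | 0.75 | Supporting |
| Bass | 0.85 | Foundation |
| Guitar | 0.70 | Accompaniment |
| Drums | 0.90 | Timing driver |
| Motif | 0.70 | Background |
| Arpeggio | 0.85 | Mid-level |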

Released under the MIT License.