
[WSC19] Translating Human Dialogue into Music


Introduction:

The goal of this project is to find differences in the tone of dialogues by translating human speech into music. Music is written in keys; can human dialogue, once translated into music, be said to have a key as well? I wanted to translate people's speech into the language of music and observe the connection between the tone of the speech and the musical key of its translation. I am interested in applying this idea to automatic translators or voice-based speech transcription. I have been into music since I was very young: I grew up as a violinist, started producing music, and now play the guitar and the piano, and I have always loved listening to music. Through the Wolfram Language's audio processing functions, I learned about the music inside our daily speech, and I wanted to go deeper into the subject and observe whether there is any musical quality within our spoken language.

Background Information:

For this project, I define musical quality as follows: if the played notes or chords fit within a specific musical key or mode, such as C major, D minor, or a Mixolydian mode, then they have a musical quality. The most important part of this project is therefore detecting the key of the played sound.

Algorithms and Code:

Converting speech into quantified data

The first step is to collect pitch data from the speech. The received audio is converted into a time series of pitch values using the PitchRecognize function.

[Audio: davecomedy, the input stand-up comedy recording]

(* recognize the fundamental pitch of the speech as quantized MIDI note numbers *)
davepitchdata = PitchRecognize[davecomedy, "QuantizedMIDI"];
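
For completeness, davecomedy is an Audio object created from the recording; below is a sketch of the setup and a quick way to inspect the result (the file name is a placeholder I made up, not the project's actual file):

(* hypothetical file name; substitute the actual recording *)
davecomedy = Audio["dave-chappelle-standup.mp3"];

(* davepitchdata is a TimeSeries; plot it to inspect the recognized pitch track *)
ListLinePlot[davepitchdata]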

Organizing the data

1) Missing-Value Elimination

PitchRecognize returns Missing values for frames where no pitch can be detected, so these entries were deleted from the data.

(* drop frames where no pitch was recognized *)
davepitchDataClean = DeleteCases[davepitchdata["Values"], _Missing];
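
As a toy illustration of this cleaning step (with made-up values):

(* Missing entries are dropped; recognized pitches are kept *)
DeleteCases[{45, Missing["Unrecognized"], 47, 45}, _Missing]
(* -> {45, 47, 45} *)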

2) Putting the data into a MIDI range

After removing the missing values, the pitch data was rescaled into the note range used by SoundNote, where 0 is middle C and the 128 MIDI keys span -60 through 67. This was done with the Rescale function, mapping the recognized values onto the interval {-60, 48}.

(* rescale each recognized pitch into SoundNote's range and round to whole semitones *)
daveroundednote = Round[Rescale[#, MinMax[davepitchDataClean], {-60, 48}] & /@ davepitchDataClean];
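
As a worked example of the rescaling, suppose the recognized values ran from 100 to 180 (roughly the range reported later); a value halfway through that range lands halfway through the target interval:

(* hypothetical input range {100, 180} *)
Round[Rescale[140, {100, 180}, {-60, 48}]]
(* -> -6, halfway between -60 and 48 *)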

3) Putting the data into a musical scale

Before the data could be fit to a scale, the scale system had to be constructed. This was done by first listing all the modes as strings. Then, taking 0 to be middle C, each mode was assigned its interval pattern in semitones, and each pattern was repeated at 12-semitone offsets to cover ten octaves. Finally, RotateMode and BuildScaleSystem root the relative modes around a chosen mode, so that any mode can serve as the scale.

(* the seven diatonic modes *)
modes = {"Ionian", "Dorian", "Phrygian", "Lydian", "MixoLydian", 
   "Aeolian", "Locrian"};

(* each mode's interval pattern, in semitones above its root *)
modeSystemStructure = {"Ionian" -> {0, 2, 4, 5, 7, 9, 11}, 
   "Dorian" -> {0, 2, 3, 5, 7, 9, 10}, 
   "Phrygian" -> {0, 1, 3, 5, 7, 8, 10}, 
   "Lydian" -> {0, 2, 4, 6, 7, 9, 11}, 
   "MixoLydian" -> {0, 2, 4, 5, 7, 9, 10}, 
   "Aeolian" -> {0, 2, 3, 5, 7, 8, 10}, 
   "Locrian" -> {0, 1, 3, 5, 6, 8, 10}};

(* each mode rooted on middle C (0) and repeated over ten octaves *)
modeSystemC = (# -> 
      Table[(# /. modeSystemStructure) + 12 i, {i, 0, 9}] &) /@ modes;

(* pair each mode name with its root's offset (in semitones) inside the given mode's scale *)
RotateMode[mode_] := 
 MapThread[
  Rule, {RotateLeft[modes, 
    mode /. MapThread[Rule, {modes, Range[0, 6]}]], (mode /. 
      modeSystemC)[[1]]}]

(* shift every relative mode's octave tables by its root offset, giving the full scale system *)
BuildScaleSystem[modeSystem_, mode_String] := 
 Function[u, u -> ((u /. modeSystem) + (u /. RotateMode[mode]))] /@ (RotateMode[mode][[All, 1]])

BuildScaleSystem[modeSystemC, "MixoLydian"]
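
To see what RotateMode produces, its output for MixoLydian (from the definitions above) pairs each relative mode with its root's offset, in semitones, within the Mixolydian scale:

RotateMode["MixoLydian"]
(* -> {"MixoLydian" -> 0, "Aeolian" -> 2, "Locrian" -> 4, "Ionian" -> 5, 
      "Dorian" -> 7, "Phrygian" -> 9, "Lydian" -> 10} *)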

(* the Locrian interval pattern extended six octaves below middle C, joined with 
   the octave above middle C taken from the scale system *)
LocrianScale = 
  Flatten[Union[Table[{0, 1, 3, 5, 6, 8, 10} - 12 i, {i, 1, 6}], 
    Take["Locrian" /. BuildScaleSystem[modeSystemC, "Locrian"], 1]]];

(* snap each rounded note to the nearest pitch in the Locrian scale *)
davescalednote = Nearest[LocrianScale, #] & /@ daveroundednote;
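
For instance, a note that is not in the scale gets snapped to its nearest scale tone; a toy illustration with one octave of the Locrian pattern:

(* 12 is not in the scale, so it snaps to the nearest member, 10 *)
Nearest[{0, 1, 3, 5, 6, 8, 10}, 12]
(* -> {10} *)

Note that Nearest returns a list, so each element of davescalednote is a one-note list, which SoundNote plays as a single-note chord; this is also why Mean[#] appears in the next step.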

4) Choosing the instrument and arrangement

The instrument was chosen with an If statement inside the SoundNote expression that builds the Sound. I arranged the data for two separate instruments: notes at or below -20 are played on the harp and higher notes on the organ, so the lower and upper registers get distinct timbres.

(* play each scale-snapped note for 0.009 s: harp in the low register, organ in the high *)
daveoutput = Sound[SoundNote[#, 0.009, If[Mean[#] <= -20, "Harp", "Organ"]] & /@ davescalednote]
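
The result is an ordinary Sound expression, so it can be played and saved; a sketch, with a made-up file name:

(* play the result in a notebook session *)
EmitSound[daveoutput]

(* hypothetical file name; saves the notes as a MIDI file *)
Export["daveoutput.mid", daveoutput]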

5) Hear the Final Product!

https://soundcloud.com/jamie-lim-14/dave-chappelle-standup-comedy-converted-to-music

Problems / Room for Improvement

There were some minor troubles during the process, such as putting the quantified speech data into the MIDI note range and fitting it onto a specific scale. SoundNote's 128 keys run from -60 through 67 semitones around middle C, while the recognized pitch values fell roughly between 100 and 180, so the data had to be rescaled before it could be played.

Main Results:

The wider the dynamic range of the dialogue's volume, the louder and more dynamically varied the musical output was.

The translation into music can be set to any of the seven diatonic modes (Locrian, Aeolian, Dorian, Ionian, Mixolydian, Lydian, Phrygian).

Future Work

Exploring alternate ways to translate dialogue into music, such as incorporating note velocity derived from the audio's loudness (a sketch follows below).

Putting all translations into the same musical key and tempo.
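
As a rough sketch of the velocity idea, one could measure the local loudness of the recording and attach it to each note through SoundNote's SoundVolume option. The choice of measurement ("RMSAmplitude") and the alignment of loudness values to notes below are my assumptions, not part of the project:

(* local RMS loudness of the recording *)
loudness = AudioLocalMeasurements[davecomedy, "RMSAmplitude"]["Values"];

(* rescale to a 0-1 volume range and align lengths with the note list *)
volumes = Rescale[Take[loudness, UpTo[Length[davescalednote]]]];
notes = Take[davescalednote, Length[volumes]];

(* attach a per-note volume to the existing arrangement *)
daveoutputvel = Sound[
  MapThread[
   SoundNote[#1, 0.009, If[Mean[#1] <= -20, "Harp", "Organ"], 
     SoundVolume -> #2] &, {notes, volumes}]]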

https://github.com/limjaeyoon/DialogueMusicalAnalysis.git


POSTED BY: Jae Yoon Lim