Exploring the expressive capacity of Recurrent Neural Networks (RNNs) for tango lyrics composition

In this essay I describe a method for curating tango lyrics generated with machine learning (ML). The method uses text analysis of the input corpus to guide the selection of ML-generated paragraphs that contain words of interest frequently used by famous composers.

I previously experimented with a TensorFlow implementation of an LSTM (a particular type of RNN) known as 'char-rnn' to create tango lyrics from a corpus of 1,427 songs (see previous essay here). Although the resulting compositions were interesting in themselves and certainly read and sounded like tango, they lacked a message meaningful enough to be inspirational, reflective or thoughtful. If ML is to help the tango community compose new lyrics from the collective poetry of the past, extensive experimentation with generative text approaches is needed to achieve human-level lyrics that singers and musicians can ultimately interpret and that can inspire dancers as well.

For this reason, I increased the corpus to a total of 4,874,604 characters spanning about 5,777 lyrics. This corpus comprised every song hosted on the website Todotango.com, with titles and signatures removed, so that only the body of each lyric, the part that holds the narrative, was included.

Training of the LSTM model was conducted at the Computer Science Laboratory, City University of Hong Kong, on a Dell Precision T5810 computer with an NVIDIA GeForce GTX 1080 graphics card. The necessary infrastructure and arrangements were kindly provided by Dr. Antoni Chan and his team. The training parameters were: 3 layers of 500 neurons each; a sequence length of 116 (because the average line in the corpus contained 29 characters, I assumed the LSTM should be unrolled for at least four lines, 4 × 29 = 116, in order to learn and remember the context of a paragraph); a batch size of 32; and 100 epochs. When sampling the model, I generated at least 13,000 characters per sample so that a frequency count could be applied in which each of the top 100 words appeared at least 3 times. Because the generated text is stochastic, I drew 3 samples for text analysis and compared them with the corpora of the famous composers and with the whole corpus as reference.
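The reasoning behind the sequence length and the sample size can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the experiment: the function names are hypothetical, and I assume that line length is measured on non-empty lines without the trailing newline.

```python
import re
from collections import Counter

def average_line_length(text):
    """Mean number of characters per non-empty line (newline excluded)."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    return sum(len(ln) for ln in lines) / len(lines)

def sequence_length(text, lines_of_context=4):
    """Unroll length that covers roughly `lines_of_context` average lines."""
    return round(average_line_length(text)) * lines_of_context

def sample_is_large_enough(sample, top_n=100, min_count=3):
    """True if the sample has at least `top_n` distinct words and the
    least frequent of the top `top_n` occurs at least `min_count` times."""
    words = re.findall(r"[^\W\d_]+", sample.lower())  # letters only, accents kept
    top = Counter(words).most_common(top_n)
    return len(top) == top_n and top[-1][1] >= min_count
```

With an average line of 29 characters, `sequence_length` returns 116, the value used for training. A generated sample that fails `sample_is_large_enough` would simply be regenerated with more characters.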

I compared the text generated by the LSTM model with the lyrics of five famous composers, Pascual Contursi (1888-1932), Enrique Santos Discépolo (1901-1951), Héctor Marcó (1906-1987), Enrique Cadícamo (1900-1999) and Alfredo Le Pera (1900-1935), and with the whole corpus. Frequency counts across all these texts let me identify the top words that could, in principle, be considered 'recurrent tango concepts'. These 'recurrent concept words' could then guide curation by pointing to the generated lines and paragraphs that contain them. Because the sequence length of 116 used in training spans four lines of text on average, the LSTM would be expected to learn and remember the context within the four lines around any 'recurrent concept word', so short pieces of text containing these words should be more meaningful or sensible when curated manually. The fact that an entire ML-generated lyric may not yet be meaningful does not prevent me from finding local contexts that contain words of interest frequently used across composers.
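As an illustration of this idea, the sketch below pulls out four-line contexts around a word of interest so they can be read and curated by hand. The function name and the choice of window (the matching line plus up to three preceding lines) are my assumptions for the example; other window placements would be equally defensible.

```python
import re

def fragments_with(word, text, context_lines=4):
    """Return candidate fragments for manual curation.

    Each fragment is the line containing `word` (matched as a whole word,
    case-insensitively) together with up to `context_lines - 1` preceding
    non-empty lines, joined by newlines.
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    pattern = re.compile(rf"\b{re.escape(word)}\b", re.IGNORECASE)
    fragments = []
    for i, line in enumerate(lines):
        if pattern.search(line):
            start = max(0, i - context_lines + 1)
            fragments.append("\n".join(lines[start:i + 1]))
    return fragments
```

Calling `fragments_with("amor", sample)` on a generated sample returns one fragment per matching line. Fragments may overlap when the word appears on neighboring lines, and the final selection remains a manual, subjective step.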

Using a frequency-count algorithm in Python to identify the 100 most frequent words across the entire corpus (5,777 tango lyrics), I selected 10 concept words that were of interest to me and that I felt represented tango lyrics well in general. These words can be arbitrarily organized as follows:

Human Emotions:

_Love (amor)

_Pain (dolor)

_Heart (corazón)

_Soul (alma)

Body Attributes:

_Eyes (ojos)

_Voice (voz)

Time (day/night & past/present):

_Night (noche)

_Today (hoy)

Life Experiences:

_Life (vida)

Personal Pronouns:

_I (yo)

_You (tu, tus, vos)

I then verified that these 10 words were also present in the corpora of the above-mentioned composers as well as in the text samples generated by the LSTM model (Figure 1).
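A frequency count of this kind, and the check that the words of interest appear among the top words of another text, might look like the following sketch. The tokenizer, the function names and the exact spellings in `CONCEPT_WORDS` are assumptions made for illustration; in particular, accented and unaccented forms (for example 'tu' and 'tú') are counted as different words here.

```python
import re
from collections import Counter

# Words of interest listed above (spellings assumed for this example).
CONCEPT_WORDS = ["amor", "dolor", "corazón", "alma", "ojos", "voz",
                 "noche", "hoy", "vida", "yo", "tu", "tus", "vos"]

def top_words(text, n=100):
    """Return the `n` most frequent words as (word, count) pairs."""
    words = re.findall(r"[^\W\d_]+", text.lower())  # letters only, accents kept
    return Counter(words).most_common(n)

def missing_concepts(text, concepts=CONCEPT_WORDS, n=100):
    """List the concept words that are NOT among the top `n` words of `text`."""
    top = {word for word, _ in top_words(text, n)}
    return [word for word in concepts if word not in top]
```

Running `missing_concepts` on each composer's corpus and on each LSTM sample gives a quick indication of which words of interest fall outside the top 100 for that text.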

Figure 1. Comparison of the 100 most frequent words in corpora containing (a) all tango songs hosted at Todotango.com (5,777 lyrics in total); (b) all lyrics by a particular composer; and (c) lyrics generated by sampling the LSTM model three consecutive times. Words of interest are highlighted in red, whereas words related to body attributes are highlighted in blue. Words are arranged in columns from the most frequent to the least frequent. Frequency counts are shown only for the entire corpus, as reference (first column from the right). The ratio of distinct (unique) words to total words in each corpus is shown at the bottom of each column. Because the lyrics generated by the LSTM model contain more new or nonsensical words, this ratio is higher for them than for the composers' corpora. The average number of characters per line is shown at the top of each column, together with the total number of characters analyzed.

As Figure 1 shows, the most frequent words in the text generated by the LSTM model include a fair number of words that are also present in the whole corpus and in the corpora of the selected composers, a clear indication that the model has learned the overall compositional structure of the input text. Interestingly, differences in word usage among the selected composers can be observed. For instance, Contursi was the only composer who used the personal pronoun 'I' (yo) more frequently than 'You' (tu), revealing a self-centered, personally focused approach to writing lyrics. Regarding body attributes, Marcó was keen to be inspired by, and write about, several of them [face (cara), mouth (boca), lips (labios), hands (manos) and chest (pecho)] in addition to the most inspirational ones [eyes (ojos) and voice (voz)]. Le Pera's writing style, in turn, tended to be characterized by lines with fewer words and characters than those of the other four composers. This shows that comparative text analysis can be a powerful method for finding insights into the collective poetry of tango, both in the work of past composers and in the poetry derived from a learning model when machine learning is applied to lyric composition.
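Two of the quantities discussed here, the ratio of distinct to total words and the relative use of pronouns, are simple to compute. The sketch below is only an illustration under my own assumptions: the function names are hypothetical, and the tokenizer treats accented and unaccented spellings as different words.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into words made of letters only."""
    return re.findall(r"[^\W\d_]+", text.lower())

def distinct_ratio(text):
    """Ratio of distinct (unique) words to total words in the text."""
    words = tokenize(text)
    return len(set(words)) / len(words)

def word_counts(text, words_of_interest):
    """Count how often each word of interest occurs in the text."""
    counts = Counter(tokenize(text))
    return {word: counts[word] for word in words_of_interest}
```

Comparing `word_counts(corpus, ["yo", "tu"])` across composers is one way to reproduce the pronoun comparison, while a higher `distinct_ratio` for generated text is consistent with the presence of invented words.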

From the combined LSTM output, I selected lyric fragments containing the word 'love' (amor), as it was the most consistently frequent word after the personal pronouns, and curated them based on their capacity to make me reflect, think or feel surprised by their tango-like qualities. A selection of these fragments in Spanish, with their Google Translate counterparts, is shown below:



Mis ojos se hicieron un pedazo

de mi callejón.

Fuerte que una tarde,

mi pecho y la emoción,

me reproche con otro amor,

herido de locura,

que nunca te he de olvidar.

My eyes became a piece

of my alley.

Strong that one afternoon,

my chest and emotion,

reproach me with another love,

hurt of madness,

I will never forget you.




Vivo por tu bien...

la llamo en la dulzura

que ha sido guapear la tristeza

de otro corazón inocente

volverán tiempos de amor.

I live for your sake ...

I call her in sweetness

what has been to sadness

from another innocent heart

will return times of love.




Y aunque tuve quemando

la traición

no queda otra vez... ilusión...

sin reproche mil noches serán

este regreso que trae la pasión

por qué hasta las guitarras me fueron,

muy tristes, hechizos, al rencor de un beso,

recuerdo de una felicidad de amor.

And although I had burning

the betrayal

it's not over again ... illusion ...

without reproach a thousand nights will be

this return that brings the passion

why even the guitars went to me,

very sad, spells, to the rancor of a kiss,

memory of a happiness of love.


The following are curated fragments of ML-generated lyrics containing the word 'pain' (dolor), which can be considered both a part of love and its opposite:



Me chamuyó desde que existe un consuelo,

y vuelvo a buscar la emoción del dolor.

I chamuyó since there is a consolation,

and I go back to look for the emotion of pain.




Te espero. Adentro del rencor

yo no te hice para amarte

como nadie llora ni un beso

para embesezar mi alma dura

¡y un poco de dolor!

I wait for you. Inside the grudge

I did not make you to love you

as nobody cries nor a kiss

to empathize my hard soul

And a little pain!




Pero lo peor

ser la sombra de mi dolor

pienso mucho que sufrí tu maldad.

But the worst

be the shadow of my pain

I think a lot that I suffered your evil.


And here I include fragments of lyrics containing other words shown in Figure 1:



Lloré por vos

la vida vieja te apura

mi corazón siente cantar.

I cried for you

the old life hurries you

my heart feels singing.




Y aún hundiste en mi corazón,

en el cielo de tus orillas,

en el lugar de mi evocación,

que hicimos en los labios,

un sueño de pasión.

And you still buried in my heart,

in the sky of your shores,

in the place of my evocation,

what we did on the lips,

a dream of passion.




Dame tus encantos primores sin cesar

cierto tu mano ha tejido la adorada novia

de mi vida y de mi gran corazón.

Y al despertar mis dolores me hablaban,

y me parece que tú me digas que sí,

aquel pasado que no se va.

Give me your precious charms incessantly

true your hand has woven the adored girlfriend

of my life and my big heart.

And when I awoke my pains spoke to me,

and it seems to me that you say yes,

that past that does not go away.




Y recordando la tarde

desbordando tu voz de sol

y sus vías adorabas

y lastiman con sabor

¡mi mujer!...

And remembering the afternoon

overflowing your sunny voice

and your ways you adored

and they hurt with flavor

my wife!...


Incorporating text analysis into a machine-learning text generation pipeline opens up many avenues of research and experimentation that could help curate more meaningful tango lyrics. Text analysis can also be incorporated into approaches that focus on word embeddings rather than on character-based modeling. Word embeddings oriented to tango lyric composition, combined with more robust text analysis methodologies such as Markov models, could greatly improve the curation of LSTM-generated text. Furthermore, text analysis can be applied at the input stage to structure the corpus in a way that facilitates learning. Additionally, it would be interesting to experiment with different RNN/LSTM implementations and compare their expressive capacity in terms of word frequency, as I've done here for 'char-rnn'. A tentative list of models to try is provided below: