Natural Language Processing of Facebook messages and their inclusion in an abstract painting


Synopsis_

In this work the author attempted to hybridize aspects of contemporary digital communication (Facebook text messages and personal data derived from social media use), text analysis and visualization using computer algorithms, and traditional abstract painting. As a result, the author conceived and created an abstract painting (acrylic on cardboard) fitted with regular mail envelopes that contain excerpts from incoming and outgoing Facebook messages. These messages resulted from exchanges between the author and 200 Facebook friends over a period of more than two years (January 1, 2017 to March 17, 2019). The work aims to be interactive: visitors to the exhibit can open the envelopes and read the message excerpts. It explores the possibilities of physical abstract painting as a carrier of personal digital data, giving visitors an insight into the social media activity of the author who created the painting.

Creation of abstract painting with regular mail envelopes_

An acrylic on cardboard painting (118 x 93 cm) incorporating regular mail envelopes was created during March 2019 (Figure 1). The visual content of the painting followed the artist's abstract style and was not directly related to data derived from text analysis of Facebook messages. The visual composition stands on its own and, at the same time, offers viewers a visual trigger to realize that the painting contains regular mail envelopes. These envelopes are the 'interactive agents' that bridge the physical and digital worlds, as they contain not regular letters but excerpts from the digital exchanges the author held with Facebook friends.

The painting has 20 mail envelopes and thus gave the author the opportunity to share with the audience 20 different conversational topics that took place via Facebook text messaging.

Figure 1. Acrylic on cardboard painting containing regular envelopes. Inside each envelope the artwork harbors a printed excerpt from Facebook text messages the author exchanged with 200 friends over a period of about two years.

Natural Language Processing of Facebook text messages_

To curate interesting conversational topics from the exchanges he held with friends on Facebook over a two-year period, the author turned to text analysis using Natural Language Processing algorithms written in Python. For this, the author manually created two text corpora: one containing all outgoing Facebook text messages from the author to his friends, and another containing all incoming Facebook text messages from friends to the author (Figure 2). The text files contained messages written in English and Spanish. Web links, email addresses, and phone numbers were removed from the corpora and thus from any subsequent analysis.
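The removal of web links, email addresses, and phone numbers described above could be approached with regular expressions. The following is only a minimal sketch: the three patterns, the function name `clean_message`, and the sample message are illustrative assumptions, since the source does not describe the actual cleaning rules.

```python
import re

# Illustrative patterns only; the author's actual cleaning rules are not given.
URL_RE = re.compile(r"(?:https?://|www\.)\S+")
EMAIL_RE = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")
# Deliberately broad: it also matches other long digit runs such as numeric dates.
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{6,}\d")


def clean_message(text):
    """Remove web links, email addresses and phone numbers from one message."""
    for pattern in (URL_RE, EMAIL_RE, PHONE_RE):
        text = pattern.sub(" ", text)
    # Collapse the whitespace left behind by the removals.
    return " ".join(text.split())


# Made-up sample message, not taken from the corpora.
sample = ("Call me at +1 (609) 555-0100 or write to someone@example.com, "
          "see https://example.com/tango")
cleaned = clean_message(sample)
print(cleaned)
```

Because the phone pattern is broad, a real corpus would need a manual check that it does not strip meaningful numbers.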

Figure 2. Sketch depicting the overall approach to curating interesting texts from the Facebook messages exchanged between the author and his friends on the social media platform. Two text corpora were created: one containing outgoing messages from the author to 200 friends, and the other containing incoming messages from 200 friends to the author. Both corpora were compared and analyzed using Natural Language Processing (NLP) and Complex Network Analysis (CNA) in Python. The insight derived from this approach guided the selection of interesting text excerpts to place inside each mail envelope associated with the painting shown in Figure 1.

The Python library Natural Language Toolkit (NLTK) was used to perform text data mining on the two corpora and to extract insights into the topics discussed between the author and his friends. These insights were then treated as the 'conversational topics' to be printed and placed inside the mail envelopes on the painting.

The corpus assembled from outgoing Facebook messages from the author to friends was 1.42 times larger than the corpus assembled from incoming messages. Despite this, the lexical richness of incoming messages from friends to the author was higher than that of outgoing messages from the author to friends. Lexical richness is the number of unique words written relative to the total size of the corpus.

Text lengths:

(OuTe Corpus) Outgoing texts: 47,516 tokens

(InTe Corpus) Incoming texts: 33,390 tokens

Lexical Richness:

OuTe: 0.20 (each word is repeated 5 times on average)

InTe: 0.26 (each word is repeated 4 times on average)
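Lexical richness as defined above can be written in a few lines of Python. This sketch assumes a corpus already split into a list of tokens. The function names and the toy sentence are illustrative, and the source does not say whether tokens were lower-cased before counting, so the comparison here is case-sensitive.

```python
def lexical_richness(tokens):
    """Ratio of distinct tokens to total tokens (0.0 for an empty corpus)."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def mean_repetitions(tokens):
    """Average number of times each distinct token is used (inverse of richness)."""
    return len(tokens) / len(set(tokens)) if tokens else 0.0


# Toy sentence, not taken from the corpora: 9 tokens, 6 of them distinct.
toy_tokens = "gracias por el tango y gracias por el video".split()
print(lexical_richness(toy_tokens))
print(mean_repetitions(toy_tokens))

# Arithmetic behind the reported figures.
size_ratio = round(47516 / 33390, 2)   # outgoing corpus size relative to incoming
oute_repeats = round(1 / 0.20, 1)      # average uses per distinct word, OuTe
inte_repeats = round(1 / 0.26, 1)      # average uses per distinct word, InTe
print(size_ratio, oute_repeats, inte_repeats)
```

The last three values reproduce the 1.42 size ratio and show that richness values of 0.20 and 0.26 correspond to roughly 5 and 4 uses per distinct word.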

Cumulative frequency graphs of the 100 most commonly used words in each corpus showed that these words constituted 42.5% of the outgoing corpus and 39.9% of the incoming corpus (Figure 3).

Figure 3a. Cumulative frequency graph for the 100 most commonly used words in the corpus of outgoing Facebook messages from the author to his friends.

Figure 3b. Cumulative frequency graph for the 100 most commonly used words in the corpus of incoming Facebook messages from friends to the author.
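The share of a corpus covered by its most common words, which is what a cumulative frequency graph plots, can be computed as sketched below. The sketch uses only the standard library; `cumulative_share` and the toy tokens are illustrative assumptions. NLTK's `FreqDist.plot(100, cumulative=True)` draws a comparable graph, although the source does not state exactly how Figure 3 was produced.

```python
from collections import Counter
from itertools import accumulate


def cumulative_share(tokens, top_n=100):
    """Return [(word, cumulative fraction of the corpus)] for the top_n most common words."""
    counts = Counter(tokens).most_common(top_n)
    total = len(tokens)
    running = accumulate(count for _, count in counts)
    return [(word, cum / total) for (word, _), cum in zip(counts, running)]


# Toy corpus, not taken from the messages: 8 tokens, 'el' x3 and 'y' x2.
toy_tokens = "el tango y el trabajo y el arte".split()
curve = cumulative_share(toy_tokens, top_n=2)
print(curve)
```

The last value of the curve is the fraction of the corpus made up of the top words, the quantity reported as 42.5% and 39.9% for the two corpora.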

Although most of the frequently used words shown in Figure 3 are function words of English and Spanish (personal pronouns and articles, for example), the author found 'work' and 'tango' interesting to analyze because they were shared between the two corpora. It is interesting to note that whereas the author frequently used words such as 'art', 'video', and 'media', these words were not frequent in incoming Facebook texts from friends.

Top_100_words:

OuTe Corpus includes: 'work' (102 occurrences), 'tango' (187 occurrences), 'art' (92 occurrences), 'video' (71 occurrences), and 'media' (61 occurrences)

InTe Corpus includes: 'work' (53 occurrences) and 'tango' (52 occurrences)

Since the painting shown in Figure 1 has 20 mail envelopes, the author needed to find 20 interesting words in total that were shared between the two text corpora. Because 'work' and 'tango' had already been found among the 100 most frequently used words, 18 additional words of interest shared between the incoming and outgoing corpora were needed. For this reason, the author focused his attention on (1) words that were longer than 7 characters and had been mentioned more than 7 times; (2) collocations; and (3) words ending with 'ing'.

(OuTe) Words from OUTGOING Facebook Messages written BY AUTHOR to friends:

['(practice', 'Argentina', 'Diciembre', 'Facebook', 'Gracias!', 'Holidays!', 'Jusleine', 'Leonardo', 'Leonardo,', 'Montevideo', 'Princeton', 'Richard,', 'Saludos!', 'University', 'University.', 'algorithm', 'alternative', 'apologize', 'articulo', 'artificial', 'artistas', 'artistic', 'artworks', 'artículos', 'audiovisual', 'available', 'background', 'building', 'buscando', 'collaborate', 'collection', 'community', 'computer', 'consulado', 'contaminación', 'daughter', 'different', 'electronic', 'electronica', 'entonces', 'entrance', 'escribir', 'festival', 'following', 'gracias!', 'gracias.', 'haciendo', 'included', 'inspired', 'interesa', 'interesante', 'interest', 'interested', 'interesting', 'intersection', 'learning', 'material', 'mensaje.', 'message.', 'mientras', 'milonga.', 'milongas', 'multimedia', 'opportunity', 'organizar', 'original', 'performance', 'performance)', 'performance.', 'practica', 'practice', 'preguntar', 'promotional', 'propuesta', 'proyecto', 'realizar', 'regards,', 'remember', 'research', 'response.', 'resulting', 'saludos!', 'shooting', 'something', 'soundtrack', 'speakers', 'surrealistic', 'technology', 'thinking', 'together', 'tomorrow', 'tonight.', 'trabajar', 'traditional', 'upcoming', 'visualization']

(InTe) Words from INCOMING Facebook Messages written BY FRIENDS to author:

['Thursday', 'actually', 'alternative', 'available', 'different', 'electronic', 'entiendo', 'everything', 'festival', 'haciendo', 'important', 'interesting', 'learning', 'material', 'performance', 'practice', 'probably', 'proyecto', 'recording', 'schedule', 'something', 'thinking', 'tomorrow']

Shared words between OuTe and InTe are (with interesting words in bold):

alternative - available - different - electronic - festival - haciendo - interesting - learning - material - performance - practice - proyecto - something - thinking - tomorrow
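The selection of long, frequent words and their intersection across the two corpora can be sketched as follows. The function name and the toy token lists are illustrative assumptions. Tokens are taken as whitespace-separated strings, which is consistent with the attached punctuation in entries such as 'Gracias!' above, although the source does not state the tokenization used.

```python
from collections import Counter


def long_frequent_words(tokens, min_length=7, min_count=7):
    """Distinct tokens longer than min_length characters that occur more than min_count times."""
    counts = Counter(tokens)
    return sorted(w for w, c in counts.items()
                  if len(w) > min_length and c > min_count)


# Toy corpora with made-up counts, not the real message data.
outgoing = ["festival"] * 8 + ["Princeton"] * 9 + ["tango"] * 20 + ["proyecto"] * 3
incoming = ["festival"] * 10 + ["schedule"] * 8 + ["tango"] * 12

out_words = long_frequent_words(outgoing)
in_words = long_frequent_words(incoming)
shared = sorted(set(out_words) & set(in_words))
print(out_words, in_words, shared)
```

In this toy data 'tango' is frequent but too short and 'proyecto' is long but too rare, so only 'festival' passes both thresholds in both corpora.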

A collocation is a sequence of words that occurs together unusually often and is characteristically resistant to substitution with words that have similar senses. Clear examples in the results below include names of cities, such as Hong Kong, Buenos Aires, and New York.

Collocations of words in OuTe:

Hong Kong; message finds; would like; new media; gracias por; Kind regards,; short film; media art; Happy Holidays!; creates surrealistic; Martin Calvino; machine learning; two weeks; regards, Martin; resulting images; Buenos Aires; surrealistic images; promotional material; so. Happy; computer algorithm

Collocations of words in InTe:

Hong Kong; Hola Martín,; New York; Hola Martín!; muchas gracias; Gracias por; gracias por; Muchas gracias; machine learning; alternative room; Hola Martín.; get back; would love; Hey Martin,; little bit; Hola Martin; right now.; 21.30 hs.y; hs.y sabados; Looking forward

Interesting shared collocations:

Hong Kong - machine learning
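To illustrate what "occurring together unusually often" means, the sketch below ranks adjacent word pairs by pointwise mutual information (PMI) using only the standard library. This is a swapped-in, simplified scoring chosen for clarity and is not claimed to be the method that produced the lists above. Those lists resemble the output of NLTK's `Text.collocations()`, but the source does not name the function used. The function name, thresholds, and toy tokens are assumptions.

```python
import math
from collections import Counter


def top_collocations(tokens, min_count=2, top_n=20):
    """Rank adjacent word pairs by pointwise mutual information (PMI)."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    total = len(tokens)
    scored = []
    for (first, second), count in bigrams.items():
        if count < min_count:
            continue  # ignore pairs seen too rarely to be reliable
        # PMI compares the observed pair frequency with the frequency expected
        # if the two words were independent (probabilities approximated by count / total).
        pmi = math.log2(count * total / (unigrams[first] * unigrams[second]))
        scored.append((pmi, first + " " + second))
    # Highest PMI first; ties broken alphabetically for a stable order.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [pair for _, pair in scored[:top_n]]


# Toy text, not taken from the corpora.
toy_tokens = ("I love machine learning and I love tango . "
              "I teach machine learning and I dance tango .").split()
top_three = top_collocations(toy_tokens, top_n=3)
print(top_three)
```

Pairs such as 'machine learning' score higher than 'I love' because their component words rarely appear apart from each other.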

The author was interested in analyzing the use of verbs or words ending with 'ing' and how their frequency may change in outgoing versus incoming Facebook messages:

Words in OuTe that end with 'ing' and were mentioned more than 7 times:

['DJing', 'asking', 'being', 'bring', 'building', 'coming', 'dancing', 'doing', 'during', 'following', 'going', 'interesting', 'learning', 'leaving', 'letting', 'living', 'looking', 'morning', 'playing', 'resulting', 'sending', 'shooting', 'something', 'taking', 'talking', 'thing', 'thinking', 'trying', 'upcoming', 'using', 'working', 'writing']

Words in InTe that end with 'ing' and were mentioned more than 7 times:

['being', 'dancing', 'doing', 'everything', 'getting', 'going', 'interesting', 'learning', 'looking', 'meeting', 'recording', 'something', 'talking', 'thinking', 'working']

Shared words between OuTe & InTe ending with '-ing':

being - dancing - doing - going - interesting - learning - looking - something - talking - thinking - working
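Because the two corpora differ in size, raw counts of '-ing' words are not directly comparable; a rate per 1,000 tokens is one way to compare them. The sketch below is an assumption about how such a comparison might be made and does not describe the author's procedure. The function name and toy counts are illustrative.

```python
from collections import Counter


def ing_rates(tokens, min_count=7, per=1000):
    """Occurrences per `per` tokens for words ending in 'ing' seen more than min_count times."""
    counts = Counter(tokens)
    # A plain suffix test also keeps non-verbs such as 'thing' or 'morning'.
    return {w: c * per / len(tokens)
            for w, c in counts.items()
            if w.endswith("ing") and c > min_count}


# Toy corpora with made-up counts, not the real message data.
outgoing = ["dancing"] * 10 + ["shooting"] * 8 + ["thing"] * 2 + ["hola"] * 180
incoming = ["dancing"] * 8 + ["meeting"] * 9 + ["hola"] * 83

out_rates = ing_rates(outgoing)
in_rates = ing_rates(incoming)
shared_ing = sorted(out_rates.keys() & in_rates.keys())
for word in shared_ing:
    print(word, out_rates[word], in_rates[word])
```

Here 'dancing' has the higher raw count in the outgoing toy corpus but the higher rate in the incoming one, which is the kind of difference a size-normalized comparison brings out.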

Examination of shared-words contexts between incoming and outgoing Facebook messages_

A concordance view shows every occurrence of a given word together with its context. This makes it possible to compare the contexts in which words shared between incoming and outgoing Facebook messages were written. Based on the text analysis shown previously, the author identified 20 words and collocations shared between incoming and outgoing Facebook messages that make interesting cases for a concordance. These words and their contexts will be printed and included in the painting shown in Figure 1 by placing them inside the envelopes (20 envelopes - 20 shared words - 20 concordances) (Figure 4). Based on the above, the selected words are:

work - tango - alternative - electronic - festival - performance - proyecto - Hong Kong - machine learning - being - dancing - doing - going - interesting - learning - looking - something - talking - thinking - working
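A concordance of this kind can be sketched as a keyword-in-context listing. The function below is a minimal standard-library illustration; its name, parameters, and toy text are assumptions. NLTK offers `Text.concordance()` for the same purpose, and the source does not state which tool produced the printed concordances.

```python
def concordance(tokens, word, width=40, max_lines=25):
    """Return up to max_lines keyword-in-context lines for `word` (case-insensitive)."""
    lines = []
    target = word.lower()
    for i, token in enumerate(tokens):
        if token.lower() != target:
            continue
        # Keep at most `width` characters of context on each side of the match.
        left = " ".join(tokens[max(0, i - width):i])[-width:]
        right = " ".join(tokens[i + 1:i + 1 + width])[:width]
        lines.append(f"{left:>{width}} {token} {right}")
        if len(lines) == max_lines:
            break
    return lines


# Toy text loosely modeled on the messages, not an excerpt from the corpora.
toy_tokens = "Your work is also very interesting ! How is your work going ?".split()
kwic = concordance(toy_tokens, "work", width=15)
for line in kwic:
    print(line)
```

Right-aligning the left context places the keyword in the same column on every line, which makes contexts easy to compare when printed.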

Figure 4. A printout of the concordance for the word 'work', to be placed inside a regular mail envelope associated with the abstract painting.

Let's take a look at the concordance of the word 'work':

Outgoing messages: 25 out of 102 matches displayed

. I left you a phone message at your work in Princeton. I wanted to reach out

to possible implement collaborative work Hi Keiichi, thanks for your kind wor

hi, thanks for your kind words. Your work is also very interesting! Perhaps we

program! Thank you so much for your work and help! sure, no rush. I am thinki

itional dancers (female and male) to work along with you? Hi Santiago and Beat

st part comes: that is to submit the work to several Film_Festivals in the Ind

I am confident you guys will make it work on the spot with improvisation. Alre

does Mondays and Wednesdays evenings work with you? The short film will probab

program! Thank you so much for your work and help! here is the written essay

st part comes: that is to submit the work to several Film Festivals in the Ind

stead of the studio and you guys can work it out as best accommodate this sure

the evening Jusleine just get out of work at 6PM and lives in NJ at what time

earsal ok, what days besides Sundays work with you? Female dancer is Jusleine

do it? Sundays are Mondays (anytime) work best for me. because it is on the su

-provoking research, commentary, and work relating to current discourse and em

ht y de ahi la conexión) Ryota! your work is amazing... Ahi te mande mensaje w

ch other any other time! How is your work going? Kind regards Muchas gracias M