Martin Calvino

Human-Machine creativity: implementing artificial intelligence for art making

Updated: May 18

SUMMARY

Much of today’s art created with Artificial Intelligence (AI) algorithms is the result of computer scientists and artists using Generative Adversarial Networks (GANs) to generate realistic images. Art made with GANs is a new field of study and artistic practice that has gained much momentum and media attention. However, as artists use pre-assembled image datasets to train their models, the field risks exhausting its aesthetics by becoming visually homogeneous. In the work presented here, I've circumvented this problem by assembling my own image dataset derived exclusively from my artistic portfolio, which contains hand-made as well as algorithmic artworks. I present original paintings recently created by training a Deep Convolutional Generative Adversarial Network (DCGAN). The visual output from these AI-generated paintings was incorporated back into abstract artworks of my own creation using traditional means, and these could subsequently be used as input to the DCGAN algorithm. When implemented iteratively, this procedure can lead to coevolution of human-machine creativity, as the artistic output of each entity (human and AI) influences the other.


CONTEXT

Non-human intelligent agents have been a recurrent topic in science fiction; take for instance Stanley Kubrick's film '2001: A Space Odyssey' released in 1968, William Gibson's novel 'Neuromancer' released in 1984, or Laeta Kalogridis' television series 'Altered Carbon' released in 2018. Broadly speaking, works of science fiction inspired by artificial intelligence (AI) can be categorized as utopian (pointing to potential benefits of AI) or dystopian (pointing to potential dangers of AI). Despite the propensity of science fiction authors to portray machines with human-like intelligence in their works, society in general has struggled with the idea of machines exhibiting intelligent behavior, and has found it perhaps even more difficult to accept that they might be creative. In contrast, a few artists with technological inclinations have been exploring the idea of creativity in artificially intelligent systems since the 1970s. Pioneers in this field include Harold Cohen, Ken Feingold, David Rokeby, Lynn Hershman and George Legrady.


These artists saw the computer (and the underlying algorithms that made it work) not only as a tool at their disposal, but also as a creative entity in its own right that could assist them throughout their creative process. Their view at the time aligned well with the emerging field of computational creativity, defined as the building of software exhibiting various degrees of autonomous behavior that could be considered creative by humans.


From the 1970s until today, the field of artificial intelligence has advanced dramatically, mainly owing to innovations in hardware that allowed for increased computation, and to the ample availability of data used to train learning algorithms. Today, there is a breadth of technology-oriented artists who use AI tools to craft new images, with their creative processes empowered by AI algorithms that are freely available and shared among computer scientists and artists. One of these algorithms, the Generative Adversarial Network (GAN), was first introduced in 2014 by Ian Goodfellow, then a PhD student in computer science at the University of Montreal.


Artists co-opt GANs when creating art so that the algorithm learns a given visual aesthetic from the analysis of thousands of images previously selected by the artist and fed as input to the algorithm. Subsequently, the algorithm attempts to generate new images that resemble the visual characteristics it has learned from the input images. For instance, the Paris-based art collective Obvious (with code borrowed from another artist, Robbie Barrat) trained a GAN model on thousands of portrait images and eventually produced a Rembrandt-like painting they named 'Portrait of Edmond Belamy', which was later sold by Christie's in October of 2018 for $432,500. This was a landmark moment for AI art, sometimes called GANism, that brought the field onto the main stage of the art world and at the same time attracted great media attention. Other GAN-generated paintings that have been sold at auction include Mario Klingemann's 'Memories of Passersby I', sold by Sotheby's for $52,800 in March of 2019, and Ahmed Elgammal's 'St. George Killing the Dragon', sold for $16,000 in November of 2017. The field has grown to include many artists such as Memo Akten, Helena Sarin, Chris Peters, Anna Ridler, Jake Elwes, Tom White, Trevor Paglen, and Gene Kogan. At the same time, several museums and exhibition spaces are producing shows highlighting AI art, including the ongoing 'Uncanny Valley: Being Human in the Age of AI' at the de Young Museum in San Francisco, and the recently finished 'The Question of Intelligence – AI and the Future of Humanity' at The New School’s Anna-Maria and Stephen Kellen Gallery. In these shows, artists not only present their work generated by AI means, but also question the social, economic and cultural implications of AI in contemporary times.


As GAN-generated art continues to develop, a limitation is becoming evident: many artists using the same pre-assembled image datasets are producing similar imagery and at the same time exhausting the aesthetic possibilities of this new art genre. Because of this, most GAN-generated artworks look homogeneous and repetitive. Furthermore, only a few artists engage further with the image output produced by the model as a means to take their artistic inquiry in new directions. Instead, they move on to a new pre-assembled image dataset and repeat the process. In contrast, the work I present here is based on GAN-generated artworks with novel and refreshing visual characteristics derived from training the model with my own image dataset. This dataset is drawn from my own artistic portfolio, which includes hand-made art (abstract paintings on canvas) as well as algorithmic art (art made by writing computer code) and digital drawings (art made with a Wacom tablet). Furthermore, I've used the image output of different GAN models to further develop my abstract paintings by incorporating into them visual elements produced by the algorithms; this encouraged me to keep painting in a traditional manner (acrylic on different supports) and deepened my understanding of my own work along the way.


Although the work described here derived from my first artistic experimentation with GANs, it extends previous efforts in applying machine learning algorithms for creative purposes. My work includes tango lyric generation and microRNA gene synthesis using Recurrent Neural Networks (LSTMs); image classification of hand-made versus algorithmically-made artworks using Convolutional Neural Networks, with visualization of the classification process; and detection of political ideology from news articles using document clustering and topic modeling.



GENERATIVE ADVERSARIAL NETWORKS EXPLAINED

GANs enable computers to create realistic image data by using not one, but two, separate neural networks that are trained simultaneously: one (the Generator) is trained to produce fake images, and the other (the Discriminator) is trained to distinguish the fake images from real ones (Figure 1). Generative points to the overarching objective of the model: creating new image data. Adversarial points to the game-like, competitive dynamic between the two models that comprise the GAN framework. Networks refers to the class of machine learning models commonly used to embody the Generator and Discriminator: neural networks. In the work I present here, the networks implemented were deep convolutional networks, a GAN variant known as DCGAN (deep convolutional generative adversarial networks). The goal of the Generator is to create image examples that capture the visual characteristics of the training dataset so closely that they look indistinguishable from the training data. In a sense, the Generator can be considered an object recognition model in reverse. Whereas object recognition algorithms learn to recognize visual patterns within images in order to identify an image's content, the Generator, instead of recognizing visual patterns, learns to create them from scratch. This is evidenced by the fact that the Generator takes as input a vector of random numbers. The Generator then learns from the feedback it obtains from the classification performed by the Discriminator.


The objective of the Discriminator is to determine whether a particular example is real (an image coming from the training dataset) or fake (an image produced by the Generator). This means that each time the Discriminator classifies a fake image as real, the Generator knows it did something well. On the other hand, each time the Discriminator correctly rejects a Generator-created image as fake, the Generator receives feedback that it needs to improve.


The Discriminator also keeps improving. As a classifier, it learns by measuring how far its predictions are from the true labels (real images from the training dataset or fake images produced by the Generator). As the Generator improves at creating realistic images, the Discriminator also improves at discerning fake images from real ones, and both networks improve in parallel.


A key question when training a GAN model is when the training loop should be stopped. In other words, how do we know that a GAN has effectively learned the distribution of the training dataset, so that we can settle on the number of training iterations? This is a tricky question because the Generator and Discriminator have opposite purposes; as one improves, the other's performance gets worse. They are said to engage in a zero-sum game that in theory should reach Nash equilibrium: a point during training at which neither network can get better at achieving its objective by changing its actions. For this reason, a GAN in equilibrium is considered to have converged, although in practice this is hardly ever achieved. Because of this, it cannot be verified whether the distribution of the training dataset and the generated distribution have converged, and thus there is no criterion for when to stop the training cycle. In practice, users decide on their own when to stop, according to their visual taste for the generated images as the iterations progress.
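The zero-sum game described above is usually written as the minimax objective from Goodfellow's original 2014 formulation. The notation follows the legend of Figure 1 (x for a real image, z for the random noise vector), with G and D standing for the Generator and the Discriminator; the symbols p_data and p_z, for the distribution of the training images and of the noise, are introduced here only for illustration.

```latex
\min_{G}\,\max_{D}\; V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_{z}}\big[\log\big(1 - D(G(z))\big)\big]
```

In theory, at equilibrium the generated distribution matches the training distribution and the Discriminator can do no better than output a probability of 1/2 for every image.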

Figure 1. Visual description of the GAN framework taken from Langr, J and Bok, V (2019). GANs in action: deep learning with generative adversarial networks. Published by Manning, New York.

1. Training dataset - the dataset of real images for the Generator to learn to emulate. This dataset serves as input (x) to the Discriminator network.

2. Random noise vector - the raw input (z) to the Generator Network. This input is a vector of random numbers that the Generator uses as a starting point for synthesizing fake images.

3. Generator network - the Generator takes in a vector of random numbers (z) as input and outputs fake images (x*). Its objective is to make x* as close to x as possible.

4. Discriminator network - the Discriminator takes in as input either a real image (x) coming from the training set or a fake image (x*) produced by the Generator. For each image, the Discriminator outputs the probability that the image is real.

5. Iterative training/tuning - for each of the Discriminator's predictions, we determine how good it is (much as we would for a regular classifier) and use the results to iteratively tune the Discriminator and the Generator networks through backpropagation: (a) the Discriminator's weights and biases are updated to maximize its classification accuracy (maximizing the probability of correct prediction: x as real and x* as fake); (b) the Generator's weights and biases are updated to maximize the probability that the Discriminator misclassifies x* as real.
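The two objectives in step 5 can be made concrete with a small sketch. It assumes binary cross-entropy as the classification loss, a common choice for DCGANs that the text does not actually specify, and it works on single hand-picked probabilities rather than on real networks. The function names and the latent size of 100 are illustrative assumptions.

```python
import math
import random

def sample_noise(dim, rng):
    """Step 2: a random noise vector z, here drawn from a standard normal."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]

def binary_cross_entropy(p, label, eps=1e-7):
    """Loss for one prediction p = D(image); label 1 means real, 0 means fake."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))

def discriminator_loss(p_real, p_fake):
    """Step 5a: D wants x classified as real (1) and x* as fake (0)."""
    return binary_cross_entropy(p_real, 1) + binary_cross_entropy(p_fake, 0)

def generator_loss(p_fake):
    """Step 5b: G wants D to classify x* as real (1)."""
    return binary_cross_entropy(p_fake, 1)

z = sample_noise(100, random.Random(0))  # 100 is an assumed latent size

# A Discriminator that is fooled (gives a fake image a 0.9 chance of being real)
fooled_d, fooled_g = discriminator_loss(0.9, 0.9), generator_loss(0.9)
# versus one that correctly rejects the fake (gives it only 0.1).
sharp_d, sharp_g = discriminator_loss(0.9, 0.1), generator_loss(0.1)

print(f"fooled D: loss_D={fooled_d:.2f}, loss_G={fooled_g:.2f}")
print(f"sharp  D: loss_D={sharp_d:.2f}, loss_G={sharp_g:.2f}")
```

When the Discriminator is fooled, its own loss is high and the Generator's is low; when it rejects the fake, the situation reverses. This is the adversarial feedback described in steps (a) and (b).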



TECHNICAL IMPLEMENTATION & RESULTS

For this project I experimented with three DCGAN implementations previously described by Matthew Mann, Jason Brownlee, and Jakub Langr and Vladimir Bok, respectively. All implementations were in Python, using TensorFlow as the backend and the Keras library to construct the GAN models. For each implementation I changed hyperparameters such as kernel size, activation function, batch size and number of training iterations, and observed the visual output to select the configuration that gave the best imagery. Differences and similarities among the code implementations will be described in a subsequent essay.


Two image datasets of my own were used: one contained 740 images, and the other 2,549 images. All images were derived from my artistic portfolio, comprising the following types of artworks (Figure 2): (1) abstract paintings (acrylic on canvas, wood and cardboard); (2) drawings (pen and/or sharpie on paper, digital drawings with Wacom tablet); (3) algorithmic art (visual works created by writing computer code in Processing, JavaScript, Max-MSP-Jitter). All images in the training set were resized to 256x256 pixels.


Figure 2a. Examples of artworks created with hand gestures. This section of the dataset was previously described in my image classification work.


Figure 2b. Examples of algorithmic artworks. This section of the dataset was previously described in my image classification work.


In order to get acquainted with the whole process of training a DCGAN model, I first ran several small and quick proof-of-concept experiments with very small training datasets. This gave me a sense of what to expect and an idea of the visual output I might get once training was complete. The first successful 'big' experiment (artistically speaking) was conducted with the small dataset of 740 images and involved a training cycle of 29,037 iterations/rounds (314 epochs). The DCGAN model was trained on the CPU of my MacBook Pro laptop and took several hours. Visual results outputted by the model are shown in Figure 3.
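The relationship between the iterations and epochs reported above can be checked with a little arithmetic. The sketch below assumes that one iteration processes one mini-batch and that the batch size was 8. The batch size is not stated in the text; it is only an assumption that happens to reproduce the reported number of epochs. The function name is illustrative.

```python
def epochs_from_iterations(iterations, batch_size, dataset_size):
    """Passes over the dataset completed after a number of mini-batch iterations."""
    return iterations * batch_size / dataset_size

ASSUMED_BATCH_SIZE = 8  # assumption: not reported in the text

epochs = epochs_from_iterations(29_037, ASSUMED_BATCH_SIZE, 740)
print(round(epochs))  # rounds to the reported 314 epochs

# Sampling the latent space every 100 iterations, 48 paintings per sample
# (the 8x6 grids described in the Figure 3 caption):
grids = 29_037 // 100
paintings = grids * 48
print(grids, paintings)
```

Under these assumptions the run yields about 290 sample grids, or roughly 13,920 candidate paintings to review.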


Figure 3. Visual output of DCGAN model after 314 epochs. Each square within an image is a 'painting' generated by the model. The latent space from the Generator was sampled every 100 iterations to produce 48 paintings that were arranged in 8x6 format. Click on images to enlarge.


The results were artistically exciting to me: the generated images, containing curved lines and deformed circles, somehow resembled those of my artwork used as input. Furthermore, although the colors generated resembled in part those I had previously used throughout my works, the model has combined them in novel and surprising ways. From the total output of the model, I selected 96 paintings that I deemed 'interesting'. By 'interesting' I mean novel forms and color combinations that I could incorporate back into my practice of abstract painting (Figure 4). While several of the images outputted by the model could be exhibited 'as is', the rest provided me with innovative ideas to explore further.


Figure 4. Abstract artworks (sharpie, pen, pastels and watercolor on cardstock paper) inspired by some of the visual elements outputted by the DCGAN model that are shown in the paintings of Figure 3.


Based on the positive outcome shown in Figure 3, I performed a second experiment with an enlarged dataset of 2,549 images. The training of the DCGAN model was again done on my laptop and lasted 43 hours. The training cycle consisted of 50,000 iterations (157 epochs), with sampling of the latent space (48 images) every 100 iterations. This allowed me to assess the output images as the iterations progressed. Several reports have indicated that the best quality images are usually outputted during the middle of the training process. This was not the case for this experiment, as I obtained the most interesting images towards the end of the training cycle (Figure 5):


0 to 10,000 iterations >> 14 paintings selected

10,000 to 20,000 iterations >> 49 paintings selected

20,000 to 30,000 iterations >> 25 paintings selected

30,000 to 40,000 iterations >> 24 paintings selected

40,000 to 50,000 iterations >> 81 paintings selected
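The counts above can be tallied to see how the selection was spread over the training run. This is a minimal sketch that only re-uses the numbers given in the text (paintings selected per block of 10,000 iterations, and 48 paintings sampled every 100 iterations); the variable names are illustrative.

```python
# Paintings selected per block of 10,000 iterations (counts from the list above).
selected = {
    (0, 10_000): 14,
    (10_000, 20_000): 49,
    (20_000, 30_000): 25,
    (30_000, 40_000): 24,
    (40_000, 50_000): 81,
}

total_selected = sum(selected.values())
best_block = max(selected, key=selected.get)
share_last_block = selected[(40_000, 50_000)] / total_selected

# 48 paintings were sampled every 100 iterations over 50,000 iterations.
total_generated = (50_000 // 100) * 48

print(total_selected, best_block, f"{share_last_block:.0%}", total_generated)
```

By this tally, 193 paintings were kept out of roughly 24,000 generated, and about 42% of them came from the last fifth of the training run.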


Figure 5. Visual output of DCGAN model after 157 epochs. Each square within an image is a 'painting' generated by the model. The latent space from the Generator was sampled every 100 iterations to produce 48 paintings that were arranged in 8x6 format. Click on images to enlarge.


The visual output shown in Figure 5 was as interesting as that shown in Figure 3, and validated the approach of using my own artworks as input to the model. It is important to note that the second dataset contained 100 images obtained from running the DCGAN model with the small dataset (740 images), and also included the artwork shown in Figure 4. Consequently, successive runs of DCGAN will include images from Figure 5 as added material to the training set.


One aspect of the outputted images from both experiments that surprised me was the near absence of straight lines, since many of my algorithmic artworks contain them.


DISCUSSION

My work has demonstrated that by incorporating custom-made image datasets derived from my own artistic portfolio into a DCGAN pipeline, the resulting aesthetics of the images outputted by the model are by no means exhausted, as is the case with previous GAN-generated works that have used pre-assembled image datasets. In this sense, the process by which my GAN-derived artworks have emerged is in accordance with that practiced by artists such as Helena Sarin and Anna Ridler, who also create their own image datasets for training. Furthermore, I have gone a step further than these artists and created abstract paintings containing visual elements generated by the DCGAN model (Figure 4), highlighting the innovative potential that the incorporation of this technology can have on my art practice. In this manner, my interaction with the algorithm is an active one: producing the input imagery, evaluating and curating the output, and appropriating the output back into art making.


By integrating color, form and spatial relations between visual elements that were generated by the GAN model based on what it learned from my art, and subsequently placing them into my current artworks created by traditional means, I was able to identify unconscious ‘artistic decisions’ that I had been making all along without properly recognizing them as such. This machine-mediated communication with my unconscious aesthetic decisions allowed me to recognize my own inner bias when making art. This procedure has important implications for machine-mediated human creativity in general, since the output of AI algorithms that learn from an artist’s portfolio and subsequently produce art emulating that of the artist can be incorporated back into the artist’s creative work. When this process is repeated over and over again, human-machine creativity can COEVOLVE. Coevolution of art making between artist and AI algorithm is to me the most interesting aspect of the interaction with an intelligent system. It is this aspect of my approach to AI-generated art that distinguishes what I do from the rest (Figure 4). Furthermore, the utilization of a contemporary technology has sparked in me a renewed interest in art making through traditional means, combining the new with the old.


My approach re-contextualizes the current argument of AI blurring the definition of artist, since the utilization of AI encouraged me to draw, paint and create art with established techniques and traditional procedures. This places me, the artist, at the center of the creative process independently of the technology being used. My artistic stance thus emulates my days as a full-time scientist, in which technological innovation and scientific procedures were used to help answer a concrete question and/or advance a specific line of research, rather than having a technological solution seeking the right scientific question. With this in mind, I would like to express reservations about the term GANism, as originally coined by deep learning researcher and author Francois Chollet in 2017 to define the breadth of new artworks generated with GANs, mainly because it places the emphasis on the technology and not on the artist and his/her artistic procedure and development. Furthermore, there are other machine learning algorithms that are also generative (take for instance Recurrent Neural Networks) and that have been used for artistic purposes, yet are by definition not included in the term GANism. Alternatively, when generative machine learning algorithms are used within the context of coevolution between the art produced by the algorithm and the art produced by the artist, the emphasis is on the procedure regardless of the generative algorithm used. For this reason, I would prefer the term COEVOLUTIONISM to properly frame the creation of art under the circumstances and procedures I have just described in this text.


Figure 4. Schematic diagram depicting the procedure I’ve developed to continually create AI-inspired art as a means to further develop my style of abstract painting.



ACKNOWLEDGEMENTS

I want to thank Matthew Mann from the Computer Science Department at the University of Regina for helping me with code implementations of DCGAN variants.



REFERENCES

Bailey, J (2018) Helena Sarin: why bigger isn’t always better with GANs and AI art. URL:

https://www.artnome.com/news/2018/11/14/helena-sarin-why-bigger-isnt-always-better-with-gans-and-ai-art

Borowska, K (2019) AI gets creative thanks to GANs innovations. URL:

https://www.forbes.com/sites/kasiaborowska/2019/02/15/ai-gets-creative-thanks-to-gans-innovations/#45c8e9a9708c

Broeckman, A (2016). Machine art in the twentieth century. Published by MIT Press, Cambridge.

Brownlee, J (2019). Generative adversarial networks with Python. Published by Machine Learning Mastery.

Calvino, M (2019) Automated detection of political ideology from text: a case study of newspapers in Uruguay. URL:

https://www.martincalvino.co/post/2019/11/12/automated-detection-of-political-ideology-from-text-a-case-study-of-newspapers-in-uruguay

Calvino, M (2019) Visualization of Convolutional Neural Network’s representation of images from Calvino’s artworks. URL:

https://www.martincalvino.co/post/2019/09/07/visualization-of-convolutional-neural-networks-representation-of-images-derived-from-hand

Calvino, M (2019) Image classification of Calvino’s artworks using Convolutional Neural Networks. URL:

https://www.martincalvino.co/post/2019/07/19/image-classification-of-calvinos-artworks-using-convolutional-neural-networks

Calvino, M (2018) Creation of synthetic microRNA169 gene copies using machine learning. URL:

https://www.martincalvino.co/post/2018/08/06/creation-of-synthetic-microrna169-gene-copies-using-machine-learning-artistic-purpose-at

Calvino, M (2018) Exploring the expressive capacity of Recurrent Neural Networks (RNNs) for tango lyrics composition. URL:

https://www.martincalvino.co/post/2018/07/04/exploring-the-expressive-capacity-of-recurrent-neural-networks-rnns-for-tango-lyrics-comp

Calvino, M (2018) AI-generated tango: pure machine creativity churns out tango lyrics like those of the golden age. URL:

https://www.martincalvino.co/post/2018/06/10/ai-generated-tango-pure-machine-creativity-churns-out-tango-lyrics-like-those-of-the-gold

Chollet, F (2018). Deep learning with Python. Published by Manning, New York.

Cohn, G (2018) Up for bid, AI art signed ‘algorithm’. URL:

https://www.nytimes.com/2018/10/22/arts/design/christies-art-artificial-intelligence-obvious.html?searchResultPosition=3

Cohn, G (2018) AI art at Christie’s sells for $432,500. URL:

https://www.nytimes.com/2018/10/25/arts/design/ai-art-sold-christies.html?searchResultPosition=6

de Young Museum (Accessed on April, 2020) Uncanny Valley: being human in the age of AI. URL:

https://deyoung.famsf.org/exhibitions/uncanny-valley

Elgammal, A (2019) AI is blurring the definition of artist. URL:

https://www.americanscientist.org/article/ai-is-blurring-the-definition-of-artist

Elgammal, A (2018) 75% of people think this AI artist is human. URL:

https://www.fastcompany.com/90253470/75-of-people-think-this-ai-artist-is-human

Grimes, W (2016) Harold Cohen, a pioneer of computer-generated art, dies at 87. URL:

https://www.nytimes.com/2016/05/07/arts/design/harold-cohen-a-pioneer-of-computer-generated-art-dies-at-87.html?searchResultPosition=45

Kent, VT et al. (2017) Coevolution between transposable elements and recombination. Philosophical Transactions of the Royal Society B 372: 20160458

Langr, J and Bok, V (2019). GANs in action: deep learning with generative adversarial networks. Published by Manning, New York.

Lohr, S (2018) From agriculture to art – the A.I. wave sweeps in. URL:

https://www.nytimes.com/2018/10/21/business/from-agriculture-to-art-the-ai-wave-sweeps-in.html

Loos, T (2020) Artists explore A.I., with some deep unease. URL:

https://www.nytimes.com/2020/04/08/arts/design/ai-artists-exhibitions.html

López de Mántaras, R (2017) Artificial intelligence and the arts: toward computational creativity. URL:

https://www.bbvaopenmind.com/en/articles/artificial-intelligence-and-the-arts-toward-computational-creativity/

Mann, M website (Accessed on April, 2020). URL: https://matchue.ca/p/earthgan/

Metz, C and Collins, K (2018) How an A.I. ‘cat-and-mouse game’ generates believable fake photos. URL:

https://www.nytimes.com/interactive/2018/01/02/technology/ai-generated-photos.html?searchResultPosition=1

Paul, C (2015). Digital art (third edition). Published by Thames & Hudson, London.

Reas, C; McWilliams, C; LUST (2010). Form + code in design, art and architecture. Published by Princeton Architectural Press, New York.

Shani, O (2015) From science fiction to reality: the evolution of artificial intelligence. URL:

https://www.wired.com/insights/2015/01/the-evolution-of-artificial-intelligence/

Shanken, EA (2009). Art and electronic media. Published by Phaidon, London.

Schneider, T and Rea, N (2018) Has artificial intelligence given us the next great art movement? Experts say slow down, the ‘field is in its infancy’. URL:

https://news.artnet.com/art-world/ai-art-comes-to-market-is-it-worth-the-hype-1352011

Smith, A (2019) Chris Peters paints based on A.I. creations. URL:

https://hifructose.com/2019/02/27/chris-peters-paints-based-on-a-i-creations/

Sotheby’s (Accessed on April, 2020) Mario Klingemann, Memories of Passersby I. URL:

http://www.sothebys.com/en/auctions/ecatalogue/2019/contemporary-art-day-auction-l19021/lot.109.html?locale=en&clickdate=2020-04-26T16%3A23%3A37Z&ranMID=42390&ranEAID=TnL5HPStwNw&ranSiteID=TnL5HPStwNw-WM6W957GjZ3V4.U47DgVrA

The New School (Accessed on April, 2020) The Question of Intelligence – AI and the future of humanity. URL:

https://ww2.newschool.edu/pressroom/pressreleases/2019/AI.htm

Vincent, J (2019) A never-ending stream of AI art goes up for auction. URL:

https://www.theverge.com/2019/3/5/18251267/ai-art-gans-mario-klingemann-auction-sothebys-technology

Whitehead, H et al. (2019) The reach of gene-culture coevolution in animals. Nature Communications. DOI: 10.1038/s41467-019-10293-y

Wikipedia (Accessed on April, 2020) Artificial intelligence in fiction. URL:

https://en.wikipedia.org/wiki/Artificial_intelligence_in_fiction

ABOUT THE AUTHOR

Martin Calvino is a multimedia artist and scientist focused on abstract art, and the intersection of media arts with genomics, machine learning and tango culture. His work can be accessed at https://www.martincalvino.co

New York, United States