An undeniable part of the human condition is our ability to create. It is not only an ability but a desire that has driven us since the beginning of history. Since our early days, both as cave people and as toddlers, we have constantly looked for opportunities to show just how creative we can be.
For my boy, it was all about getting the most colorful patterns onto our recently painted white walls. Seriously, he didn’t need crayons or paint: guacamole or smashed beans would do to share his art with the rest of the family. Once he grew a bit older and developed his motor skills, we got him a little MIDI keyboard. He had a blast with his atonal creations, to say the least. Now he is about to start school, and I can’t help wondering whether he will be as creative with his writing once he masters that skill. Of course, I think my boy is special, but I am sure that his creative behavior is quite normal among growing children.
If we look at it from the bigger perspective of the human race, for early humans this creativity took the form of cave paintings depicting their hunts, as well as abstract patterns. The Romantic era gave us the mastery of Tchaikovsky, with his ability to inspire very strong feelings of glory and tragedy with something that is transmitted through the air as pressure variations (a bit of a fancy way of saying through sound!). In recent times I have found myself listening to more and more audiobooks. I love creative fantasy stories about fictional characters that can ride dragons (I have actually read the full GoT saga!).
We long to be creative and we give high recognition to the creative people around us.
It is not surprising that we have started to wonder about the ultimate limits of creativity:
Can we create something that is in itself creative?
This is a question that we will aim to answer in a series of tutorials at digitalemerge.ai. In order to try to answer this question, we will use generative modeling.
With the accelerated advances in multiple technologies, such as deep learning, we are now able to build smart bots that can paint original artwork in a given style. We can also build similar bots that can write coherent paragraphs of text with a rich long-term structure. There are machines that can compose original musical pieces that are actually pleasant to listen to. An example of this creativity taken to the extreme was presented in an episode of the Amazon original series “Mozart in the Jungle” (very good series!).
One of the applications of creative machines that attracts me the most is machines that can develop winning strategies for complex games, such as AlphaStar, which plays StarCraft II.
We are living at the start of the generative deep learning revolution! There is no better time than the present to learn about this amazing technology, and you can do it here, at digitalemerge.ai.
We will divide this guide into several posts. The posts will make up two parts:
- Introduction to Generative Deep Learning (GDL)
- Introduction to the use of GDL to build creative machines
Part I. Introduction to GDL
During the first part of the guide, we will introduce the core techniques that we will need to start building generative deep learning models. We will first take a broad look at the field of generative modeling and explain how it differs from discriminative learning (we can think of discriminative learning as the more traditional ML). As we will see, we cannot solve everything with generative modeling. In fact, it solves almost none of the traditional applications that artificial intelligence currently finds in a business setting. Don’t get me wrong, this doesn’t mean that generative modeling won’t find its way into the business world! It is just that we humans have to be more creative in finding applications that generate value for the business (no pun intended).
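To make the distinction concrete, here is a minimal sketch that uses only the Python standard library. It is not taken from the upcoming posts: the toy data, the function names and the one-dimensional Gaussian model are all assumptions chosen for illustration. The discriminative function only learns where to draw the line between two classes, while the generative model learns what one class looks like and can therefore produce new samples of it.

```python
import random
import statistics

random.seed(0)  # make the toy example reproducible

# Hypothetical toy data: one numeric feature for each of two classes.
cats = [random.gauss(2.0, 0.5) for _ in range(200)]
dogs = [random.gauss(5.0, 0.5) for _ in range(200)]


def fit_discriminative(class_a, class_b):
    """Learn only a decision boundary: the midpoint between the class means."""
    return (statistics.mean(class_a) + statistics.mean(class_b)) / 2


def fit_generative(samples):
    """Learn a model of the data itself: here, a single Gaussian."""
    return statistics.mean(samples), statistics.stdev(samples)


def generate(model, n):
    """Draw brand-new observations from a fitted generative model."""
    mean, std = model
    return [random.gauss(mean, std) for _ in range(n)]


threshold = fit_discriminative(cats, dogs)
cat_model = fit_generative(cats)

# The discriminative model can label an observation...
label = "cat" if 2.1 < threshold else "dog"

# ...but only the generative model can create new "cats".
new_cats = generate(cat_model, 5)
print(label, [round(x, 2) for x in new_cats])
```

Real generative models replace the single Gaussian with a deep neural network, but the idea stays the same: learn the data well enough to sample from it.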
Later in the guide, we will provide a brief introduction to the deep learning tools that we will need. As we will see, we need the power of deep learning to create the most powerful generative models. We will take a practical, hands-on approach rather than a theoretical analysis of the topic.
We will introduce Keras, a framework for building neural networks that can be used to construct and train the most cutting-edge neural network architectures published to date. If we receive requests for it, we might also work with other frameworks such as MXNet or PyTorch.
Once we have the foundations in place and have introduced most of the practical aspects of deep learning, we will work on creating our first generative deep learning model: the variational autoencoder. After this, we will present one of the coolest developments in the field of deep learning in recent years: generative adversarial networks (GANs).
Learning outcomes Part I
By the end of Part I in the guide, you will:
- Understand what generative modeling is and how it differs from discriminative modeling
- Understand why we need deep learning to power generative modeling
- Understand deep learning at a practical level
- Understand what Keras is and how it can be used to build neural networks
- Understand what a variational autoencoder is and its potential applications
- Understand what a GAN is and its potential applications
Part II. Introduction to the use of GDL to build creative machines
The second part of the guide will be a natural continuation of Part I: the knowledge necessary for this second part will be covered in Part I.
During Part I, we will have introduced the field of GDL and two of its most important advancements in recent years, variational autoencoders and GANs. During the rest of the guide, we will focus on a specific task: teaching machines how to be creative. We will cover the following creative fields:
- Painting
- Writing
- Music composition
- Game playing
We will close the guide with a look at what the future applications and developments might be.
Part II will be exciting: we will introduce several necessary, beautiful and powerful concepts, such as recurrent neural networks (RNNs) and deep reinforcement learning! RNNs will enable us to attack problems involving sequential data, such as text, music and more! When introducing RNNs, we will place special emphasis on presenting the most widely used layers at a fundamental level: Long Short-Term Memory (LSTM) and the Gated Recurrent Unit (GRU). These layers find applications in multiple fields, and we have actually had the joy of working with them while delivering a super cool project at Capgemini. They are powerful! And I can tell you, they can actually help save lives!
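As a taste of what "sequential" means here, the following is a minimal sketch of a plain recurrent step in pure Python. It is a deliberate simplification, not an LSTM or a GRU: the scalar weights `w_x`, `w_h` and `b` and the function names are hypothetical, picked only for illustration. The point is that the hidden state `h` is carried from one element of the sequence to the next, so the order of the inputs matters.

```python
import math


def rnn_step(x, h, w_x=0.5, w_h=0.8, b=0.0):
    """One step of a scalar recurrent cell: new state from input and old state."""
    return math.tanh(w_x * x + w_h * h + b)


def run_rnn(sequence, h0=0.0):
    """Feed a sequence through the cell, carrying the hidden state along."""
    h = h0
    states = []
    for x in sequence:
        h = rnn_step(x, h)
        states.append(h)
    return states


forward = run_rnn([1.0, 0.0, -1.0])
backward = run_rnn([-1.0, 0.0, 1.0])

# Same inputs, different order: the final hidden states differ.
print(forward[-1], backward[-1])
```

LSTM and GRU layers keep this same loop but add gates that control how much of the old state is kept or overwritten at each step.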
As we will see along the way, there are several key differences between, for example, text data, sound data, and image data. These differences have dramatic consequences for the types of deep learning and generative methods that can be used to analyze them and, ultimately, to produce them. We will explain all you need to know in order to understand these at the deepest level! (again, no pun intended!)
Learning outcomes Part II
By the end of Part II in the guide, you will:
- Understand the problem of image, text, music and strategy generation
- Understand which types of deep learning layers and architectures you need to use for different types of problems
- Understand what reinforcement learning and deep reinforcement learning are
- Understand what the RNN architecture is
- Understand some of the most successful layers used in RNNs: LSTM and the Gated Recurrent Unit (GRU)
- Understand the potential applications of RNNs
- Know how to apply GDL to create an original painting
- Know how to apply GDL to create a question-answer generator
- Know how to apply GDL to compose original music
- Know how to apply GDL to create machines that can play games(!!!)
We have not yet decided whether we will present all the concepts necessary to understand the guide at a fundamental level. For now, we assume that readers have experience coding in Python.
The models that we will introduce and create during the guide are described in mathematical notation, so it will be useful, but not indispensable, to have an understanding of linear algebra (e.g. matrix multiplication) and probability theory. If we get feedback that these are concepts that we should explain, we will create a few posts about the main concepts needed.
Finally, you will need an environment in which to run the code examples we will provide (we will make them available in a GitHub repository at https://github.com/digitalemerge ).
A useful resource for training deep learning models on accelerated hardware is Google Colaboratory ( https://colab.research.google.com ). It is a free Jupyter Notebook environment that requires no setup and runs in the cloud. We can tell the notebook to run on a GPU that is provided for free, for up to 12 hours of runtime.
We have been exploring other options such as Azure Notebooks. Azure offers a very powerful data science virtual machine with up to 8 or 12 processors and a GPU that you can use for free. For more information, visit: https://notebooks.azure.com/ . Unfortunately, we haven’t found an equivalent free offering at Amazon!
We expect to present part of the contents of this guide at ODSC London 2019 in a full-day workshop. I will extend an offer similar to the one I made during my presentation at ODSC Boston 2019: I will buy you pizza if you come and say hi during the workshop 🙂
We will be working hard to release the first post in the guide as soon as possible! In the meantime, stay creative!
About the writer
Arturo is a technologist born in Veracruz, Mexico and has been living in Norway since 2010. He completed his BSc and MSc in Mexico and his PhD in Trondheim, Norway. He is taking the last semester of his Executive MBA at BI in Oslo, to be concluded in September 2020.
During his time as a scientific researcher, which started with the publication of a couple of papers in international journals after his BSc thesis, he has participated in several research projects. The results of these projects can be found in the 6 research papers shared below. The topics of these projects range from theoretical physics, mathematics and statistics to applications of Big Data technologies to understand human behavior. Part of Arturo’s training as an academic includes the use of different programming languages and scientific tools such as Mathematica, Matlab and C++, with Python being one of his most common go-to tools.
After concluding his PhD work at NTNU, Arturo made a transition into the private sector in 2015. He started by leading the development of the Mobility Analytics service in the business division of Telenor Norway, covering everything from the strategic aspects to the technical aspects of the development. Arturo worked not only on tasks such as developing go-to-market strategies but also on hands-on development of GIS data visualizations and of the code base that powers the Mobility Analytics service.
Extensive experience as a leader and communicator in both the scientific and private sectors has made Arturo a firm believer that leaders and executives seldom make decisions based on hard, cold numbers alone. The results of analytic work need to be communicated through great storytelling and an understanding of the business mechanisms through which value can be created. This realization led Arturo to start his EMBA program in 2019.
During his EMBA studies, which have a specialization in managing and developing digital enterprises and include a leadership program, Arturo has expanded his business knowledge to include fields such as digitalization strategies, innovation frameworks (such as design thinking, lean and agile), entrepreneurship, HR management, and value creation and capture.
As his professional track shows, Arturo has a mind eager to always learn and apply his knowledge. He has taken several courses and certifications within Machine Learning and Artificial Intelligence and has virtually never stopped studying, even after concluding his doctoral degree.
Arturo enjoys mentoring and helping colleagues and people in general. He is always open to grabbing a coffee, a slice of pizza or a pint of beer, so… feel free to send him a message and connect with him, even if you haven’t met him in person.