Transfer Learning

Transfer learning is a machine learning method in which a model developed for one task is reused as the starting point for a model on a second task.

In transfer learning, we carry prior knowledge from one domain over to a different domain. This is typically done by removing the final output layer of the trained network and adding a new set of neural network layers for the new problem. These new layers are then trained using the new data set.

For example, let’s say you have an AI model that recognizes cats and you want to use that knowledge to recognize elephants. The cat model is created by training it on pictures of cats (there are plenty on the internet). Once the model recognizes cats with high accuracy, the last layer of the neural network is replaced with additional layers, and those layers are trained on pictures of elephants. This way, many of the low-level features, such as edges and curves, are learned from the large dataset (in this case, cats), and the newer model is trained to recognize elephant-specific features with far less data, as shown in the figure below.
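
To make the mechanics concrete, here is a minimal sketch in plain Python of the "keep the early layers, replace and retrain the last one" idea. Everything in it is an assumption made for illustration: the tiny two-unit hidden layer, the hard-coded "pretrained" weights, the made-up data points, and the helper names `replace_head` and `train_head`. A real project would use a deep learning framework and a genuinely pretrained network.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def features(hidden, x):
    """Hidden layer: one tanh unit per weight row."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in hidden]

def predict(model, x):
    f = features(model["hidden"], x)
    return sigmoid(sum(h * fi for h, fi in zip(model["head"], f)))

def replace_head(pretrained):
    """Reuse the hidden layer as-is and attach a fresh, untrained output head."""
    return {"hidden": pretrained["hidden"],
            "head": [0.0] * len(pretrained["hidden"])}

def train_head(model, data, epochs=200, lr=0.5):
    """Gradient descent on the head only; the hidden layer is never updated."""
    for _ in range(epochs):
        for x, y in data:
            f = features(model["hidden"], x)
            p = sigmoid(sum(h * fi for h, fi in zip(model["head"], f)))
            error = p - y  # gradient of the log loss with respect to the logit
            model["head"] = [h - lr * error * fi
                             for h, fi in zip(model["head"], f)]
    return model

# Hypothetical "pretrained" model: placeholder weights standing in for a
# network that was already trained on the large source dataset (the cats).
cat_model = {"hidden": [[1.0, 0.0], [0.0, 1.0]], "head": [0.7, -0.3]}

# Small labeled dataset for the new task (the elephants); values are made up.
elephant_data = [([1.0, 0.5], 1), ([-1.0, 0.3], 0),
                 ([0.8, -0.4], 1), ([-0.7, -0.2], 0)]

elephant_model = train_head(replace_head(cat_model), elephant_data)
```

Only the new head changes during training, which is why so few examples are needed: the reused hidden layer already supplies the features.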

Most of the success today in achieving high accuracy with AI models has been driven by extensive supervised learning, which relies on large labeled datasets. For simple use cases, large amounts of labeled public data are available through various online sources (e.g., ImageNet, WordNet), but if you are building a model for a specific domain solution, large amounts of labeled data are hard to obtain, or the data will need to be cleaned and labeled manually before the model can be built. Transfer learning enables you to develop fairly accurate models using comparatively little data. This is very useful at enterprises that might not have a lot of clean, labeled data.

Therefore, on problems where you do not have very much data, transfer learning can enable you to develop skillful models that you simply could not develop without it.

Knowledge Integration

Knowledge Integration in AI

Let’s think about how humans learn. We are very good at continuously enriching and refining our knowledge and skills by seamlessly combining existing knowledge with new experiences. We exhibit a wide spectrum of learning abilities in various fields. We can be lawyers during the day, play tennis or go for a run in the evening, and make dinner at night. We are fairly adept at doing multiple tasks. With AI systems, that is usually not the case. AI systems built through machine learning are very good at one specific task, which is why they are often described as Narrow Intelligence.

Despite recent breakthroughs and advances, machine learning has a number of shortcomings when it comes to acquiring knowledge across various fields and identifying how new and prior knowledge interact to yield more insights. Knowledge integration is the process of synthesizing multiple knowledge representations into a common model. It describes how new information and existing information interact, what effects the new information will have on existing knowledge, and whether existing information needs to be modified to accommodate the new information.

Why is this concept important? It is important for building better machine learning models for enterprise knowledge insights. Not all knowledge will be readily available, nor can it all be fed into a machine learning model at once. Substantial knowledge bases are developed incrementally, and a growing body of knowledge will need to be added separately. By identifying subtle conflicts and gaps in knowledge, knowledge integration (KI) facilitates better learning models. Large firms like Google are using a combination of symbolic AI, deep learning, and supervised learning to create better knowledge understanding and knowledge reasoning.
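
As a rough illustration of what integrating new knowledge into existing knowledge can look like, here is a minimal sketch in plain Python. The representation (a dictionary of `(subject, relation)` pairs), the function names `integrate` and `find_gaps`, and the sample facts are all assumptions chosen for the example; real knowledge integration systems use far richer representations.

```python
def integrate(knowledge, new_facts):
    """Merge new facts into a copy of the knowledge base and report the outcome.

    Facts map (subject, relation) -> value. A new fact is "added" when the key
    is unknown, "confirmed" when it matches what is already stored, and a
    "conflict" when it disagrees. Conflicts are reported, not overwritten, so
    that a person or a rule can decide whether existing knowledge must change.
    """
    merged = dict(knowledge)
    report = {"added": [], "confirmed": [], "conflicts": []}
    for key, value in new_facts.items():
        if key not in merged:
            merged[key] = value
            report["added"].append(key)
        elif merged[key] == value:
            report["confirmed"].append(key)
        else:
            report["conflicts"].append((key, merged[key], value))
    return merged, report

def find_gaps(knowledge, subjects, required_relations):
    """List (subject, relation) pairs the knowledge base cannot answer yet."""
    return [(s, r) for s in subjects for r in required_relations
            if (s, r) not in knowledge]

# Hypothetical enterprise facts, added incrementally.
existing = {("acme", "headquarters"): "Boston", ("acme", "ceo"): "Lee"}
incoming = {("acme", "headquarters"): "Austin",
            ("acme", "ceo"): "Lee",
            ("acme", "founded"): 1999}

merged, report = integrate(existing, incoming)
gaps = find_gaps(merged, ["acme"], ["headquarters", "ceo", "founded", "industry"])
```

Here the headquarters fact surfaces as a conflict, the founding year is added, and the missing industry shows up as a gap.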

If you are an organization looking to extract valuable information and identify patterns within your data to create efficiency, these concepts are critical, and I highly recommend doing further research on them to achieve success.

AI Microservice

Deploy AI Models as Microservices

Microservices are a software development technique in which an application is built as a suite of small, independently deployable services organized around specific business capabilities. The idea is to break a big, monolithic application down into a collection of smaller, independent applications.

Why should machine learning models be deployed as microservices?

This is an empirical era for machine learning. As successful as deep learning has been, our understanding of why it works so well is still lacking. Machine learning engineers need to explore and experiment with different models before they settle on one that works for their specific use case. Once a model is developed, there are inherent advantages to packaging it in a container and serving it as a microservice.

Here are a few reasons why it makes sense to deploy AI models as microservices:

  • Microservices are smaller and easier to understand than a large monolithic application. They are focused on business functions, which makes it simpler to deploy a single specific function without worrying about all the other business functions.
  • Each service can be deployed independently of the others. This also allows each service to be scaled independently, rather than scaling the entire application. This is a much more efficient use of computing capacity and helps balance the allocation of computing resources. Microservices deployed in a container architecture allow for further efficiency in scaling.
  • Because each service is focused on a specific business function, developers only need to understand a small set of functions rather than the entire application.
  • Offering the model as a service makes it possible to expose it to both internal and external applications without having to move the code. Data is accessed through well-defined interfaces. Containers have built-in mechanisms for external and distributed data access, so you can leverage common data-oriented interfaces that support many data models.
  • Each team has the luxury of choosing whatever languages and tools it wants for its job without affecting anyone else, which reduces vendor or technology lock-in. By deploying machine learning models as microservices with API endpoints, data scientists and AI programmers can write models in whatever framework they prefer (TensorFlow, PyTorch, or Keras) without worrying about technology stack compatibility.
  • Microservices allow new versions to be deployed in parallel and independently of other services. Developers can work in parallel and get changes to production independently and faster. This enables the continuous delivery and deployment of large, complex machine learning applications. With production-ready frameworks like TensorFlow Serving, managing the versions of a model becomes very easy.
  • Models can be deployed to any environment: local, private cloud, or public cloud. If there are data privacy concerns about deploying AI models in the cloud, packaging individual models as containers allows them to be deployed in the local environment.
  • In most AI projects, several AI models will be developed to perform specific functions (e.g., one model for Named Entity Recognition and another for Information Extraction). Microservices allow these models to be developed, updated, and deployed independently.
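
To show what "a model behind an API endpoint" can look like, here is a minimal sketch that uses only the Python standard library. The `/predict` path, the JSON shape, the `MODEL_VERSION` label, and the placeholder linear "model" are assumptions made for illustration; in practice you would load a real trained model and would likely use a production-grade web framework or model server.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

MODEL_VERSION = "v1"  # hypothetical version label returned with each response

def predict(features):
    """Placeholder model: a fixed linear score. Swap in a real trained model."""
    weights = [0.5, -0.25]
    return sum(w * x for w, x in zip(weights, features))

def handle_predict(payload):
    """Pure function behind the endpoint, so it can be tested without a server."""
    return {"version": MODEL_VERSION, "prediction": predict(payload["features"])}

class ModelHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/predict":
            self.send_error(404, "unknown endpoint")
            return
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length))
            body = json.dumps(handle_predict(payload)).encode("utf-8")
        except (ValueError, KeyError, TypeError):
            self.send_error(400, "expected JSON with a 'features' list")
            return
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

def make_server(port=8080):
    """Build the HTTP server; call .serve_forever() on the result to start it."""
    return HTTPServer(("0.0.0.0", port), ModelHandler)
```

Calling `make_server().serve_forever()` would start the service. Any client that can send JSON over HTTP can then use the model, regardless of the language or framework the model was written in.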

Now let’s talk about some technologies that help with deploying models as microservices. Here we want to focus on two prominent technologies that allow for this to happen.

Docker:
Docker helps you create and deploy microservices within containers. It’s an open source collection of tools that helps you build and run any app, anywhere. There are plenty of resources on the internet for getting started with Docker.

Kubernetes:
When deploying microservices as containers, another aspect to keep in mind is the management of individual containers. If you want to run multiple containers across multiple machines, which you’ll need to do if you’re using microservices, you will need to manage them efficiently. You have to start the right containers at the right time, make them talk to each other, handle storage and memory considerations, and deal with failed containers or hardware. Doing all of this manually would be a nightmare, so a tool like Kubernetes is critical. Kubernetes is an open source container orchestration platform that allows large numbers of containers to work together in harmony and reduces the operational burden.
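
To give a feel for the kind of bookkeeping an orchestrator takes off your hands, here is a toy sketch in plain Python of a "desired state versus actual state" check. It is not the Kubernetes API and does not talk to any cluster; the `reconcile` function, the tuple layout, and the service names are all assumptions made for illustration.

```python
def reconcile(desired, running):
    """Compare desired replica counts with healthy containers and list actions.

    desired: {service_name: replica_count}
    running: list of (service_name, container_id, is_healthy) tuples
    """
    actions = []
    healthy = {}
    for service, container_id, is_healthy in running:
        if is_healthy:
            healthy.setdefault(service, []).append(container_id)
        else:
            # Failed containers are cleaned up and no longer count as replicas.
            actions.append(("remove_failed", service, container_id))
    for service in sorted(set(desired) | set(healthy)):
        have = healthy.get(service, [])
        want = desired.get(service, 0)
        for _ in range(want - len(have)):   # too few: start replacements
            actions.append(("start", service, None))
        for container_id in have[want:]:    # too many: stop the extras
            actions.append(("stop", service, container_id))
    return actions

# Hypothetical cluster snapshot.
desired = {"ner-model": 2, "extraction-model": 1}
running = [("ner-model", "c1", True), ("ner-model", "c2", False),
           ("extraction-model", "c3", True), ("extraction-model", "c4", True)]

actions = reconcile(desired, running)
```

In this snapshot the failed `c2` is removed and a replacement is started for the NER model, while the surplus `c4` is stopped. An orchestrator runs this kind of comparison continuously so that you do not have to.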

Used together, Docker and Kubernetes are great tools for developing a modern AI cloud architecture.

Deep Learning

What is Deep Learning?

Deep learning is a subset of machine learning that allows machines to do tasks that typically require human-like intelligence. The inspiration for deep learning comes from neuroscience: if you look at the architecture of deep learning neural networks, they are connected in a fundamental way that mirrors the brain. Deep learning networks are distinguished from the more commonplace neural networks by their depth, that is, the number of node layers through which data passes in a multistep process.

Earlier versions of neural networks were shallow, composed of one input layer and one output layer, with at most one hidden layer in between. More than three layers (including input and output) qualifies as “deep” learning. So “deep”, strictly defined, means more than one hidden layer.
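
The definition above is easy to express in a few lines of Python. This is only a sketch of the rule as stated here; the function names and the example layer sizes are assumptions made for illustration.

```python
def hidden_layer_count(layer_sizes):
    """layer_sizes lists every layer, with the input first and the output last."""
    if len(layer_sizes) < 2:
        raise ValueError("need at least an input layer and an output layer")
    return len(layer_sizes) - 2

def is_deep(layer_sizes):
    """Deep, as strictly defined above, means more than one hidden layer."""
    return hidden_layer_count(layer_sizes) > 1

shallow_net = [64, 16, 10]      # input, one hidden layer, output
deep_net = [64, 16, 16, 10]     # input, two hidden layers, output
```

With these sizes, `is_deep(shallow_net)` is false and `is_deep(deep_net)` is true.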

[Figures: Neural Network; Deep Learning Neural Network]

In deep-learning networks, each layer of nodes trains on a distinct set of features based on the previous layer’s output. The further you advance into the neural net, the more complex the features your nodes can recognize, since they aggregate and recombine features from the previous layer.

Let’s take a simple example of recognizing handwritten numbers from 1 to 10. If 10 people wrote the numbers, each person’s would look very different. For a human brain, it is fairly easy to identify them. For a traditional, rule-based program it is very difficult, and hence neural networks are used to mimic the way neurons in the brain interact. Multiple hidden layers allow a computer to determine the nature of a handwritten digit by giving the neural network a way to build a rough hierarchy of the different features that make up the digit.

For instance, if the input is an array of values representing the individual pixels in the image of the handwritten figure, the next layer might combine these pixels into lines and shapes, the layer after that might combine those shapes into distinct features like the loops in an 8 or the upper triangle in a 4, and so on. By building a picture of these features, neural networks can determine with a very high level of accuracy the number that corresponds to a handwritten digit. During training, the model also learns which links between neurons are critical in making successful predictions. Over the course of several training cycles, and with the help of occasional manual tuning, the network continues to learn and generate better predictions until it reaches the desired accuracy.
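
The layer-by-layer flow described above can be sketched in plain Python as a forward pass. This is an untrained toy network: the weights are random, the 8x8 "image" is random noise, and the layer sizes and function names are assumptions made for illustration. It shows only how each layer consumes the previous layer's output, not how the weights are learned.

```python
import math
import random

def dense(inputs, weights, biases):
    """One hidden layer: weighted sums of the previous layer's output, then ReLU."""
    return [max(0.0, sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def softmax(scores):
    """Turn raw output scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def init_layer(n_in, n_out, rng):
    """Random weights and zero biases; a trained network would learn these."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def forward(pixels, layers):
    activation = pixels
    for weights, biases in layers[:-1]:
        # Each hidden layer recombines the features produced by the layer before it.
        activation = dense(activation, weights, biases)
    weights, biases = layers[-1]
    scores = [sum(w * x for w, x in zip(row, activation)) + b
              for row, b in zip(weights, biases)]
    return softmax(scores)

rng = random.Random(0)
sizes = [64, 16, 16, 10]  # 8x8 pixels, two hidden layers, ten output classes
layers = [init_layer(n_in, n_out, rng) for n_in, n_out in zip(sizes, sizes[1:])]
pixels = [rng.random() for _ in range(64)]

probabilities = forward(pixels, layers)
predicted_class = max(range(len(probabilities)), key=probabilities.__getitem__)
```

Training would adjust the weights in `layers` so that the highest probability lands on the correct class for each handwritten image.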

Thus, deep learning allows machines to solve complex problems even when using a data set that is very diverse, unstructured, and interconnected. Deep learning networks excel at dealing with vast amounts of disparate data. In fact, deep learning algorithms generally perform better the more data they learn from.

A few additional links on this topic:
MIT Technology Review: https://www.technologyreview.com/s/513696/deep-learning/
Cambridge University paper: https://bit.ly/2Fbbrlr