Lecture 1: Introduction to Machine Learning¶

Applied Machine Learning¶

Volodymyr Kuleshov
Cornell Tech

Welcome to Applied Machine Learning!¶

Machine learning is one of today's most exciting emerging technologies.

In this course, you will learn what machine learning is, what its most important techniques are, and how to apply them to solve problems in the real world.

Part 1: What is Machine Learning?¶

We hear a lot about machine learning (or ML for short) in the news.

But what is it, really?

ML in Everyday Life: Search Engines¶

You use machine learning every day when you use a search engine.

ML in Everyday Life: Personal Assistants¶

Machine learning also powers the speech recognition, question answering and other intelligent capabilities of smartphone assistants like Apple Siri.

ML in Everyday Life: Spam/Fraud Detection¶

Machine learning is used in most modern spam filters, such as the one in Gmail.

ML systems are also used by credit card companies and banks to automatically detect fraudulent behavior.

ML in Everyday Life: Self-Driving Cars¶

One of the most exciting and cutting-edge uses of machine learning algorithms is in autonomous vehicles.

A Definition of Machine Learning¶

In 1959, Arthur Samuel defined machine learning as follows.

Machine learning is a field of study that gives computers the ability to learn without being explicitly programmed.

What do "learn" and "explicitly programmed" mean here? Let's look at an example.

An Example: Self Driving Cars¶

A self-driving car system uses dozens of components that include detection of cars, pedestrians, and other objects.

Self Driving Cars: A Rule-Based Algorithm¶

One way to build a detection system is to write down rules.

In [2]:

# pseudocode example for a rule-based classification system
def classify(obj):
    if obj.has_wheels():  # does the object have wheels?
        if len(obj.wheels) == 4:
            return "Car"  # four wheels => car
        elif len(obj.wheels) == 2:
            if obj.seen_from_back():
                return "Car"  # viewed from back, car has 2 wheels
            else:
                return "Bicycle"  # normally, 2 wheels => bicycle
    return "Unknown"  # no wheels? we don't know what it is

# camera stands for a hypothetical sensor interface
label = classify(camera.get_object())

In practice, it's almost impossible for a human to specify all the edge cases.

Self Driving Cars: An ML Approach¶

The machine learning approach is to teach a computer how to do detection by showing it many examples of different objects.

We do not write the detection rules by hand: the computer learns what defines a pedestrian or a car from the examples!
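To make the contrast with the rule-based pseudocode concrete, here is a minimal sketch in which a classifier is fit to labeled examples instead of hand-written rules. The two features (number of visible wheels, whether the object is seen from the back) and the tiny dataset are hypothetical and chosen only to mirror the earlier example; a real detection system would learn from raw camera images.

```python
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy dataset: each row is [visible_wheels, seen_from_back]
examples = [
    [4, 0], [4, 0], [2, 1], [2, 1],   # cars (two wheels visible from the back)
    [2, 0], [2, 0],                   # bicycles
    [0, 0], [0, 1],                   # objects without wheels
]
labels = ["Car", "Car", "Car", "Car", "Bicycle", "Bicycle", "Unknown", "Unknown"]

# The decision rules are learned from the examples rather than written by hand
detector = DecisionTreeClassifier(random_state=0)
detector.fit(examples, labels)

learned = detector.predict([[4, 0], [2, 1], [2, 0], [0, 0]])
print(list(learned))
```

Adding a new edge case here means adding labeled examples and refitting, not editing nested if/else branches.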

Revisiting Our Definition of ML¶

Machine learning is a field of study that gives computers the ability to learn without being explicitly programmed. (Arthur Samuel, 1959.)

This principle can be applied to countless domains: medical diagnosis, factory automation, machine translation, and many more!

Why Machine Learning?¶

Why is this approach to building software interesting?

It lets us build practical systems for real-world applications for which other engineering approaches don't work.

Learning is widely regarded as a key approach towards building general-purpose artificial intelligence systems.

The science and engineering of machine learning offers insights into human intelligence.

Part 2: Three Approaches to Machine Learning¶

Machine learning is broadly defined as the science of building software that has the ability to learn without being explicitly programmed.

How might we enable machines to learn? Let's look at a few examples.

Supervised Learning¶

The most common approach to machine learning is supervised learning.

Image credit: DataFlair

Supervised Learning: Object Detection¶

We previously saw an example of supervised learning: object detection.

We start by collecting a dataset of labeled objects.
We train a model to output accurate predictions on this dataset.
When the model sees new, similar data, it will also be accurate.
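The three steps above can be sketched with scikit-learn. As an assumption for illustration, this uses the small iris flower dataset bundled with scikit-learn and a nearest-neighbors classifier rather than an object-detection dataset and model; the held-out split plays the role of "new, similar data".

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# 1. Collect a dataset of labeled examples (flower measurements and species)
X, y = load_iris(return_X_y=True)

# Hold out a quarter of the data to play the role of "new, similar data"
X_train, X_new, y_train, y_new = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y)

# 2. Train a model to output accurate predictions on the training set
model = KNeighborsClassifier(n_neighbors=5)
model.fit(X_train, y_train)

# 3. Check how accurate the model is on data it has not seen
train_accuracy = model.score(X_train, y_train)
new_accuracy = model.score(X_new, y_new)
print(f"training accuracy: {train_accuracy:.2f}, held-out accuracy: {new_accuracy:.2f}")
```

Comparing the two accuracies is a simple check of whether the model's performance carries over to unseen data.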

A Supervised Learning Dataset¶

Consider a simple dataset for supervised learning: house prices in Boston.

Each datapoint is a house.
We know its price, neighborhood, size, etc.

In [13]:

# We will load the dataset from the sklearn ML library.
# Note: load_boston was removed in scikit-learn 1.2, so this cell needs an older version.
from sklearn import datasets
boston = datasets.load_boston()

We will visualize two variables in this dataset: house price and the education level in the neighborhood.

In [14]:

import matplotlib.pyplot as plt
plt.rcParams['figure.figsize'] = [12, 4]
plt.scatter(boston.data[:,12], boston.target)
plt.ylabel("Median house price ($K)")
plt.xlabel("% of adults in neighborhood that don't have a high school diploma")
plt.title("House prices as a function of average neighborhood education level")

Out[14]:

Text(0.5, 1.0, 'House prices as a function of average neighborhood education level')

A Supervised Learning Algorithm¶

We can use this dataset of examples to fit a supervised learning model.

The model maps an input $x$ (the education level) to an output $y$ (the house price).
It learns the mapping from our dataset of examples $(x, y)$.

In [15]:

import numpy as np
from sklearn.kernel_ridge import KernelRidge

# Apply a supervised learning algorithm
model = KernelRidge(alpha=1, kernel='poly')
model.fit(boston.data[:,[12]], boston.target.flatten())
predictions = model.predict(np.linspace(2, 35)[:, np.newaxis])

# Visualize the results
plt.scatter(boston.data[:,[12]], boston.target, alpha=0.25)
plt.plot(np.linspace(2, 35), predictions, c='red')
plt.ylabel("Median house price ($K)")
plt.xlabel("% of adults in neighborhood that don't have a high school diploma")
plt.title("House prices as a function of average neighborhood education level")

Out[15]:

Text(0.5, 1.0, 'House prices as a function of average neighborhood education level')
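Because `load_boston` is not available in recent scikit-learn releases, the same fitting step can be tried on stand-in data. The sketch below is an assumption-laden illustration: the synthetic $(x, y)$ pairs are generated, not taken from the Boston dataset, and only the model and the prediction grid mirror the cell above.

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge

# Synthetic stand-in data (an assumption, not the Boston dataset):
# y decreases roughly linearly in x, plus Gaussian noise
rng = np.random.default_rng(0)
x = rng.uniform(2, 35, size=200)
y = 40.0 - 0.9 * x + rng.normal(scale=2.0, size=200)

# Fit the same kind of model as above: kernel ridge regression, polynomial kernel
model = KernelRidge(alpha=1, kernel='poly')
model.fit(x[:, np.newaxis], y)

# Predict on an evenly spaced grid of inputs
x_grid = np.linspace(2, 35)
predictions = model.predict(x_grid[:, np.newaxis])
print(predictions[:3], predictions[-3:])
```

The learned mapping can then be plotted against `x_grid` in the same way as the red curve above.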

Applications of Supervised Learning¶

Many important applications of machine learning are supervised:

Classifying medical images.
Translating between pairs of languages.
Detecting objects in autonomous driving.

Unsupervised Learning¶

Here, we have a dataset without labels. Our goal is to learn something interesting about the structure of the data.

Image credit: DataFlair
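As a minimal sketch of learning structure from unlabeled data, the example below runs k-means clustering, one common unsupervised method, on synthetic points. The data, the choice of three clusters, and the cluster centers are assumptions made for illustration; the true group memberships are used only to inspect the result, never for fitting.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic unlabeled data: 300 points drawn around three well-separated centers.
# The true group memberships are kept only to check the result afterwards.
centers = [[-5, -5], [0, 5], [5, -5]]
X, true_groups = make_blobs(n_samples=300, centers=centers,
                            cluster_std=0.8, random_state=0)

# Fit k-means using only X; no labels are given to the algorithm
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
cluster_ids = kmeans.fit_predict(X)

print("cluster sizes:", np.bincount(cluster_ids))
print("cluster centers:\n", kmeans.cluster_centers_.round(1))
```

The cluster numbering is arbitrary: the algorithm discovers groups, but it is up to us to interpret what each group means.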

Unsupervised Learning: Text Analysis¶

In this next example, we have a text containing at least four distinct topics.

However, we initially do not know what the topics are.

Unsupervised topic modeling algorithms assign each word in a document to a topic and compute topic proportions for each document.
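A rough sketch of computing topic proportions, using scikit-learn's `LatentDirichletAllocation` as one example of a topic model. The four-sentence corpus and the choice of two topics are hypothetical; with so little text the recovered topics are not expected to be meaningful, and the point is only the shape of the inputs and outputs.

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical toy corpus with two rough themes (sports and cooking)
documents = [
    "the team won the game with a late goal",
    "the coach praised the players after the match",
    "simmer the sauce and season the soup with salt",
    "bake the bread and stir the sauce in the kitchen",
]

# Represent each document by its word counts
vectorizer = CountVectorizer(stop_words="english")
word_counts = vectorizer.fit_transform(documents)

# Fit a topic model with two topics; no topic labels are provided
lda = LatentDirichletAllocation(n_components=2, random_state=0)
topic_proportions = lda.fit_transform(word_counts)

# One row per document, one column per topic; each row sums to 1
print(topic_proportions.round(2))
```

After fitting, `lda.components_` holds one row of word weights per topic, which is what one would inspect to interpret the topics.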

Applications of Unsupervised Learning¶

Unsupervised learning methods have many other applications:

Recommendation systems: suggesting movies on Netflix.
Anomaly and outlier detection: identifying factory components that are likely to fail soon.
Signal denoising: extracting clean human speech from a noisy audio recording.

Reinforcement Learning¶

In reinforcement learning, an agent is interacting with the world over time. We teach it good behavior by providing it with rewards.

Image by Lily Weng
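The interaction loop can be illustrated with a very simple setting, a multi-armed bandit with an epsilon-greedy agent. This is a sketch under stated assumptions: the environment, its reward probabilities, and the value of `epsilon` are all made up for illustration, and full reinforcement learning problems also involve states that change over time.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical environment: three actions, each paying reward 1 with a
# different probability that is unknown to the agent
reward_probabilities = np.array([0.2, 0.5, 0.8])
n_actions = len(reward_probabilities)

value_estimates = np.zeros(n_actions)  # agent's running estimate of each action's reward
action_counts = np.zeros(n_actions, dtype=int)
epsilon = 0.1                          # probability of exploring at random on each step
total_reward = 0.0

for step in range(2000):
    # Explore with probability epsilon, otherwise pick the best-looking action
    if rng.random() < epsilon:
        action = int(rng.integers(n_actions))
    else:
        action = int(np.argmax(value_estimates))

    # The world responds with a reward of 0 or 1
    reward = float(rng.random() < reward_probabilities[action])
    total_reward += reward

    # Update the running average reward for the chosen action
    action_counts[action] += 1
    value_estimates[action] += (reward - value_estimates[action]) / action_counts[action]

best_action = int(np.argmax(value_estimates))
print("estimated values:", value_estimates.round(2))
print("action believed to be best:", best_action)
```

The agent is never told which action is best; it finds out only through the rewards it receives.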

Applications of Reinforcement Learning¶

Applications of reinforcement learning include:

Creating agents that play games such as Chess or Go.
Industrial control: automatically operating cooling systems in datacenters to use energy more efficiently.
Generative design of new drug compounds.

Artificial Intelligence and Deep Learning¶

Machine learning is closely related to these two fields.

AI is about building machines that exhibit intelligence.
ML enables machines to learn from experience, a useful tool for AI.
Deep learning focuses on a family of learning algorithms loosely inspired by the brain.


Part 3: Logistics¶

We conclude the lecture with the logistical aspects of the course.

What Is the Course About?¶

This course studies the foundations and applications of machine learning.

Algorithms: We cover a broad set of ML algorithms: linear models, boosted decision trees, neural networks, SVMs, etc.
Foundations: We explain why they work using math. We cover maximum likelihood, generalization, regularization, etc.
Implementation: We teach how to implement algorithms from scratch using numpy or sklearn.

We also cover many practical aspects of applying machine learning.

Course Contents¶

Some of the most important sets of topics we will cover include:

Basics of Supervised Learning: Regression, classification, overfitting, regularization, generative vs. discriminative models
Unsupervised Learning: Clustering, dimensionality reduction, etc.
Advanced Supervised Learning: Support vector machines, kernel methods, decision trees, boosting, deep learning.
Applying ML: Overfitting, error analysis, learning curves, etc.

Prerequisites: Is This Course For You?¶

The main requirements for this course are:

Programming: At least 1 year of experience, preferably in Python.
Linear Algebra: College-level familiarity with matrix operations, eigenvectors, the SVD, vector and matrix norms, etc.
Probability: College-level understanding of probability distributions, random variables, Bayes' rule, etc.

This course does not assume any prior ML experience.

Tutorials: Reviewing the Prerequisites¶

We will hold in-class tutorials next week on the following topics:

Linear algebra and probability
Python and Numpy tutorial

You are strongly encouraged to attend these!

Logistics¶

We meet Tue/Thu 11:40am-12:55pm in Bloomberg 131.

Class Webpage: https://canvas.cornell.edu/courses/57131
Teaching Assistants: Eliot Shekhtman (Head TA), Matthew Franchi, Shachi Deshpande.
Graders: Xinyue Cao, Hrudai Battini, Ananya Devarakonda, Yaxuan Huang
Office hours:
- Volodymyr: After class
- See website for TA office hours

We will use Canvas and Gradescope for the course.

Gradescope: download and submit all assignments there
- Click the "Gradescope" tab in Canvas
Canvas Discussions: Use the forum to ask us questions online
- Click the "Discussions" tab in Canvas
Canvas Announcements: Make sure to regularly watch for updates
- Click the "Announcements" tab in Canvas

Lecture Slides¶

The slides for each lecture will be available online on GitHub:

This is the GitHub repo with the raw content: https://github.com/kuleshov/cornell-cs5785-2023-applied-ml
We will also share lecture notes in HTML and PDF format.
Look out for links and announcements on Canvas.

Lecture Notes¶

Detailed lecture notes are available online as an HTML website.

These will be further edited over the course of the semester.

Lecture Videos¶

Lecture videos are available on YouTube. Use these to review the material.

These videos were originally recorded for the 2020 edition of the course.

Remote Attendance Policy¶

The class is fully in-person this year:

We will not be streaming lectures over Zoom (although there may be exceptions, e.g., bad weather).
Slides, lecture notes, and lecture videos from the 2020 edition of the course will be available online.
Some guest lectures and TA office hours (for Ithaca-based TAs) may be held over Zoom.

Grading¶

We will have four homeworks, a prelim, and a project.

Four homeworks: 10% each. Conceptual & programming questions
In-class prelim: 15%
Course project: 45%
- Proposal (2-3 paragraphs): 5%
- Progress report (3-5 pages): 15%
- Final report (5 pages): 25%

You have six late days in total, with a maximum of two per deliverable.

Project¶

Course projects will be done in groups of up to 3 students and can fall into one or more of the following categories:

Application of machine learning to a novel task/dataset
Algorithmic improvements to the representation, learning, or evaluation of machine learning models
Theoretical analysis of any aspect of existing machine learning methods

You are encouraged to choose your own project, but we will also make suggestions.

Policy on Generative AI¶

You are allowed to use generative AI. If you choose to use it, you must add a statement explaining how you used it.

WARNING: Do not copy and paste answers. In my experience, more than 50% of ChatGPT answers to homework questions have major flaws.

Assignment Zero¶

We will be releasing a "practice" Assignment 0 on Gradescope today.

It does not count towards your grade.
It walks you through setting up your compute environment (Jupyter, scikit-learn, matplotlib, etc.).
It is "due" in two weeks on Gradescope if you choose to do it.

In-Class Feedback System¶

We encourage you to submit (anonymous) feedback about how the course is going throughout the semester.

Use the Google form that we will post on Canvas.
Tell us what you think about assignments, the pace of the lectures, the due dates, project deliverables, etc.

Again, Welcome to Applied Machine Learning!¶

Software You Will Use¶

You will use Python and popular machine learning libraries such as:

scikit-learn. It implements most classical machine learning algorithms.
tensorflow, keras, pytorch. Standard libraries for modern deep learning.
numpy, pandas. Linear algebra and data processing libraries used to implement algorithms from scratch.

Executable Course Materials¶

The core materials for this course (including the slides!) are created using Jupyter notebooks.

We are going to embed and execute code directly in the slides and use that to demonstrate algorithms.
These slides can be downloaded locally and all the code can be reproduced.

In [29]:

import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets, neural_network
plt.rcParams['figure.figsize'] = [12, 4]

We can use these libraries to load a simple dataset of handwritten digits.

In [7]:

# https://scikit-learn.org/stable/auto_examples/classification/plot_digits_classification.html
# load the digits dataset
digits = datasets.load_digits()

# The data that we are interested in is made of 8x8 images of digits, let's
# have a look at the first 4 images.
_, axes = plt.subplots(1, 4)
images_and_labels = list(zip(digits.images, digits.target))
for ax, (image, label) in zip(axes, images_and_labels[:4]):
    ax.set_axis_off()
    ax.imshow(image, cmap=plt.cm.gray_r, interpolation='nearest')
    ax.set_title('Label: %i' % label)

We can now train a classification algorithm on this data inside the slides.

In [30]:

np.random.seed(0)
# To apply a classifier on this data, we need to flatten the images, to
# turn the data into a (samples, features) matrix:
data = digits.images.reshape((len(digits.images), -1))

# create a small neural network classifier
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
classifier = MLPClassifier(alpha=1e-3)

# Split data into train and test subsets (first half / second half, no shuffling)
X_train, X_test, y_train, y_test = train_test_split(
    data, digits.target, test_size=0.5, shuffle=False)

# We learn the digits on the first half of the digits
classifier.fit(X_train, y_train)

# Now predict the value of the digit on the second half:
predicted = classifier.predict(X_test)

We can now visualize the results.

In [31]:

_, axes = plt.subplots(1, 4)
# The test set is the second half of the images
n_samples = len(digits.images)
images_and_predictions = list(zip(digits.images[n_samples // 2:], predicted))
for ax, (image, prediction) in zip(axes, images_and_predictions[:4]):
    ax.imshow(image, cmap=plt.cm.gray_r, interpolation='nearest')
    ax.set_title('Prediction: %i' % prediction)
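Looking at four images gives only a rough impression. One way to summarize the results, sketched here as an addition to the original cells, is to compute the fraction of held-out digits that are labeled correctly. The block rebuilds the same setup so that it runs on its own; the fixed `random_state` is an assumption added for repeatability.

```python
from sklearn import datasets
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Rebuild the same setup as above: flattened 8x8 digit images,
# first half for training and second half for testing
digits = datasets.load_digits()
data = digits.images.reshape((len(digits.images), -1))
X_train, X_test, y_train, y_test = train_test_split(
    data, digits.target, test_size=0.5, shuffle=False)

# random_state is fixed here (an addition) so the run is repeatable
classifier = MLPClassifier(alpha=1e-3, random_state=0)
classifier.fit(X_train, y_train)
predicted = classifier.predict(X_test)

# Fraction of held-out digits whose predicted label matches the true label
accuracy = accuracy_score(y_test, predicted)
print(f"held-out accuracy: {accuracy:.3f}")
```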