Northeastern University
Machine Learning with Small Data Part 1


Instructor: Sarah Ostadabbas


Gain insight into a topic and learn the fundamentals.
1 week to complete at 10 hours a week
Flexible schedule: learn at your own pace

Details to know

Shareable certificate: add to your LinkedIn profile
Recently updated: June 2025
Assessments: 8 assignments
Taught in English


There are 7 modules in this course

In this module, we will explore the pivotal role of data as the foundation for machine learning algorithms. We begin by discussing why large datasets matter so much when training deep learning models, since they are crucial to the models' effectiveness and successful application. We will also delve into the challenges posed by small datasets, particularly in sensitive fields such as healthcare and defense, where data acquisition is often difficult, costly, or subject to stringent privacy and security regulations. To address these challenges, the course introduces strategies for making the most of limited data, including data-efficient machine learning techniques and synthetic data augmentation, as previewed in the sketch below. Finally, we present the course structure and a curated selection of research papers that align with and enrich the course topics.
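As a concrete preview of the synthetic augmentation theme, here is a minimal sketch (illustrative code using PyTorch's torchvision, not material from the course itself) showing how random transformations let one labeled image stand in for many training examples:

```python
# A minimal augmentation sketch using torchvision (illustrative only, not
# course code). Each pass through the pipeline produces a slightly
# different view, so a handful of labeled images can be stretched into
# many synthetic training examples.
import numpy as np
from PIL import Image
from torchvision import transforms

augment = transforms.Compose([
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.RandomResizedCrop(224, scale=(0.8, 1.0)),
    transforms.ColorJitter(brightness=0.2, contrast=0.2),
    transforms.ToTensor(),
])

# Stand-in image; in practice this would be a real sample from the
# small labeled dataset.
image = Image.fromarray(np.uint8(np.random.rand(256, 256, 3) * 255))
views = [augment(image) for _ in range(5)]   # five synthetic variants
print(len(views), views[0].shape)            # 5 torch.Size([3, 224, 224])
```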

What's included

2 videos · 13 readings · 1 assignment

In this module, we examine the core role of data in machine learning, with a focus on deep learning applications. We start by emphasizing how large datasets enable deep learning models to capture and learn from complex patterns, improving their overall performance. We then explore the interplay of data availability, computational power, and model capacity, and how these elements combine to determine model accuracy and efficiency. The module also covers computing advances beyond Moore's Law and their impact on machine learning, illustrating how modern hardware such as CPUs, GPUs, and TPUs provides the computational capability needed to train sophisticated models. Next, we turn to scaling laws in deep learning: empirical findings showing that model performance improves predictably, though with diminishing returns, as dataset size and model complexity grow (see the sketch below). For a deeper theoretical foundation, we examine Vapnik-Chervonenkis (VC) theory, which relates learning curves and model complexity to a model's ability to generalize from training data. This discussion extends to practical applications and theoretical limitations, framing machine learning challenges in terms of data sufficiency, model fitting, and the balance between bias and variance. By the end of this module, students will have a thorough understanding of the dynamic interplay between these factors and its implications for machine learning practice and research.
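To make the scaling-law discussion concrete, the following toy sketch (with made-up loss values, not course data) fits the commonly used power-law form L(N) = a·N^(-b) + c to hypothetical (dataset size, test loss) measurements:

```python
# Toy scaling-law fit (made-up numbers, illustrative only): test loss
# falls as a power law in dataset size N, with diminishing returns.
import numpy as np
from scipy.optimize import curve_fit

def scaling_law(n, a, b, c):
    # L(N) = a * N^(-b) + c, where c is the irreducible loss floor.
    return a * n ** (-b) + c

sizes = np.array([1e3, 3e3, 1e4, 3e4, 1e5, 3e5])          # dataset sizes
losses = np.array([2.10, 1.65, 1.32, 1.10, 0.95, 0.87])   # hypothetical

(a, b, c), _ = curve_fit(scaling_law, sizes, losses, p0=(10.0, 0.3, 0.5))
print(f"fit: L(N) = {a:.2f} * N^(-{b:.2f}) + {c:.2f}")
print("extrapolated loss at N=1e6:", scaling_law(1e6, a, b, c))
```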

What's included

1 video · 19 readings · 2 assignments · 1 app item

In this module, we’ll explore transfer learning and its role in data-efficient machine learning, where models leverage knowledge from previous tasks to improve performance on new, related tasks. We’ll cover the main types of transfer learning, including transductive, inductive, and unsupervised methods, each addressing different challenges and applications. We’ll then walk through practical steps for implementing transfer learning, such as selecting and fine-tuning pre-trained models, to reduce reliance on large datasets (see the sketch below). We’ll also examine data-driven and physics-based simulations for data augmentation, highlighting their use in enhancing training under constrained conditions. Finally, we’ll review key papers on transfer learning techniques that address data scarcity and improve model performance.
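Here is a minimal fine-tuning sketch, assuming a PyTorch/torchvision setup (illustrative, not the course's own code): an ImageNet-pretrained ResNet-18 is reused as a frozen feature extractor, and only a new classification head is trained for a hypothetical five-class target task.

```python
# Minimal fine-tuning sketch (illustrative, not the course's own code):
# reuse an ImageNet-pretrained ResNet-18 as a frozen feature extractor
# and train only a new classification head on the small target dataset.
import torch
import torch.nn as nn
from torchvision import models

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Freeze the pretrained backbone so the limited target data only has to
# fit the new head, reducing the risk of overfitting.
for param in model.parameters():
    param.requires_grad = False

num_target_classes = 5  # hypothetical small-data task
model.fc = nn.Linear(model.fc.in_features, num_target_classes)

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
x = torch.randn(4, 3, 224, 224)   # stand-in batch of target images
print(model(x).shape)             # torch.Size([4, 5])
```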

What's included

1 video · 15 readings · 1 assignment

In this module, you'll explore the concept of domain adaptation, a key aspect of transductive transfer learning. Domain adaptation helps you train models that perform well on a target domain, even when its data distribution differs from the source domain. You'll learn about the challenges of domain shift and labeled data scarcity and how these can impact model performance. We'll cover different types of domain adaptation, including unsupervised, semi-supervised, and supervised approaches. You'll also dive into techniques like Deep Domain Confusion (DDC), which integrates domain confusion loss into neural networks to create domain-invariant features. Additionally, you'll discover advanced methods such as Domain-Adversarial Neural Networks (DANNs), Correlation Alignment (CORAL), and Deep Adaptation Networks (DANs) that build on DDC to enhance domain adaptation by aligning feature distributions and capturing complex dependencies across network layers.
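The adversarial idea behind DANNs can be captured in a few lines. Below is a compact, illustrative PyTorch sketch of a gradient reversal layer: the forward pass is the identity, while the backward pass multiplies gradients by -lambda, pushing the feature extractor toward domain-invariant representations.

```python
# Compact, illustrative gradient reversal layer (the core trick in DANN).
# Forward pass is the identity; backward pass multiplies gradients by
# -lambda, so features are trained to *confuse* the domain classifier.
import torch

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Negate (and scale) gradients flowing back to the features;
        # lam itself receives no gradient.
        return -ctx.lam * grad_output, None

def grad_reverse(x, lam=1.0):
    return GradReverse.apply(x, lam)

features = torch.randn(8, 64, requires_grad=True)
grad_reverse(features, lam=0.5).sum().backward()
print(features.grad[0, :3])   # tensor([-0.5000, -0.5000, -0.5000])
```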

What's included

1 video · 10 readings · 1 assignment

In this module, we’ll explore weak supervision, a family of techniques for training machine learning models with limited, noisy, or imprecise labels. You'll learn about the different types of weak supervision and why they are crucial in small data domains. We’ll cover semi-supervised learning, self-supervised learning, and active learning, along with advanced methods such as Temporal Ensembling and the Mean Teacher approach (sketched below). You'll also discover Bayesian deep learning and active learning strategies that improve training efficiency. Finally, you'll see real-world applications in fields like medical imaging, NLP, fraud detection, autonomous driving, and biology.
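To illustrate the Mean Teacher approach, here is a minimal PyTorch sketch (a toy with stand-in models, not the course's reference code): the teacher's weights are an exponential moving average (EMA) of the student's, and a consistency loss pulls the student's predictions on unlabeled data toward the teacher's.

```python
# Minimal Mean Teacher sketch (toy, illustrative only): the teacher is an
# exponential moving average (EMA) of the student, and a consistency loss
# pulls the student's predictions on unlabeled data toward the teacher's.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

student = nn.Linear(10, 3)            # stand-in classifier
teacher = copy.deepcopy(student)
for p in teacher.parameters():
    p.requires_grad = False           # teacher is never trained directly

def ema_update(teacher, student, decay=0.99):
    with torch.no_grad():
        for t, s in zip(teacher.parameters(), student.parameters()):
            t.mul_(decay).add_(s, alpha=1 - decay)

x_unlabeled = torch.randn(16, 10)
consistency = F.mse_loss(
    F.softmax(student(x_unlabeled), dim=1),
    F.softmax(teacher(x_unlabeled), dim=1),
)
consistency.backward()        # gradient reaches the student only
ema_update(teacher, student)  # teacher then tracks the student via EMA
print(consistency.item())
```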

What's included

1 video · 8 readings · 1 assignment

In this module, you'll explore how Zero-Shot Learning (ZSL) enables models to recognize new categories without having seen any examples of those categories during training. This is achieved by leveraging intermediate semantic descriptions, such as attributes, shared between seen and unseen classes. You'll also learn about the importance of regularization in preventing overfitting and improving generalization, as well as how generative models like GANs and VAEs enhance ZSL by synthesizing unseen class data. Additionally, we'll examine Generalized Zero-Shot Learning (GZSL), which tests models on both seen and unseen classes, making the task more challenging and realistic. By the end of this module, you'll have a solid understanding of how ZSL and its extensions can be applied to various machine learning tasks.
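A toy sketch of the attribute-based ZSL idea (with invented attributes and classes, purely for illustration): predict an attribute vector for the input, then assign the class whose attribute signature is most similar, even if that class was never seen in training.

```python
# Toy attribute-based zero-shot classification (illustrative, with
# made-up attributes): map an input into attribute space, then label it
# with the class whose attribute vector is most similar.
import numpy as np

# Hypothetical class signatures over attributes [striped, hooved, aquatic].
class_attributes = {
    "zebra":   np.array([1.0, 1.0, 0.0]),   # unseen during training
    "horse":   np.array([0.0, 1.0, 0.0]),
    "dolphin": np.array([0.0, 0.0, 1.0]),
}

def predict_zero_shot(predicted_attributes):
    def cosine(u, v):
        return u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-8)
    scores = {name: cosine(predicted_attributes, sig)
              for name, sig in class_attributes.items()}
    return max(scores, key=scores.get)

# Pretend an attribute predictor (trained on seen classes) produced this.
attrs_from_image = np.array([0.9, 0.8, 0.1])
print(predict_zero_shot(attrs_from_image))   # -> "zebra"
```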

What's included

1 video · 9 readings · 1 assignment

This module focuses on Few-Shot Learning (FSL), a critical paradigm in machine learning that enables models to classify new examples with only a small number of labeled instances. Unlike traditional deep learning models that require vast amounts of labeled data, FSL mimics the human ability to generalize from limited examples, making it highly useful for tasks like image classification, object detection, and natural language processing (NLP). The lecture introduces Matching Networks, a metric-based learning approach designed to solve one-shot learning problems by learning a similarity function that maps new examples to previously seen labeled instances. Students will gain an in-depth understanding of how nearest-neighbor approaches, differentiable embedding functions, and attention mechanisms help in optimizing few-shot learning models. Through discussions, theoretical formulations, and real-world applications, this lecture equips students with practical insights into how AI can function effectively in data-scarce environments.
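The Matching Networks mechanism can be sketched compactly. The illustrative PyTorch snippet below (a toy with a stand-in linear embedding, not the paper's full architecture) embeds support and query examples, computes cosine-similarity attention, and predicts by attention-weighted voting over the support labels.

```python
# Toy Matching Networks sketch (illustrative; real models use learned CNN
# embeddings and full context embeddings): classify a query by
# attention-weighted voting over a small labeled support set.
import torch
import torch.nn as nn
import torch.nn.functional as F

embed = nn.Linear(10, 16)                  # stand-in embedding function

support_x = torch.randn(5, 10)             # 5-way, 1-shot support set
support_y = torch.tensor([0, 1, 2, 3, 4])  # one labeled example per class
query_x = torch.randn(3, 10)               # three queries to classify

s = F.normalize(embed(support_x), dim=1)   # unit-norm embeddings so the
q = F.normalize(embed(query_x), dim=1)     # dot product is cosine similarity

attention = F.softmax(q @ s.T, dim=1)                  # (3, 5) weights
one_hot = F.one_hot(support_y, num_classes=5).float()
class_probs = attention @ one_hot                      # weighted label vote
print(class_probs.argmax(dim=1))                       # predicted classes
```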

What's included

1 video · 7 readings · 1 assignment

Earn a career certificate

Add this credential to your LinkedIn profile, resume, or CV. Share it on social media and in your performance review.

Instructor

Sarah Ostadabbas
Northeastern University
1 course · 2 learners

Offered by

Northeastern University


