Northeastern University
Data Management for Analytics Part 1

Instructor: Xuemin Jin

Gain insight into a topic and learn the fundamentals.
9 hours to complete
Flexible schedule
Learn at your own pace
Build toward a degree

Details to know

Shareable certificate

Recently updated!

August 2025

Assessments

29 assignments

Taught in English

There are 7 modules in this course

In this module, we will introduce the fundamental concepts of database management, review applications of database technology, and define key concepts. We will also contrast the file-based approach to data management with the database approach. Finally, we will examine the elements of a database system and the advantages of database design.

What's included

3 videos, 7 readings, 4 assignments

In this module, we take a quick look at what is under the hood of a database management system (DBMS). We will examine the key components of DBMS architecture and how these components work together for data storage, processing, and management. We also examine how DBMSs can be categorized based on data model, degree of simultaneous access, architecture, and usage.

What's included

3 readings, 3 assignments

In this module, we first review the database design process, from conceptual through logical to physical database design, and elaborate on the data requirements of a business process. We then introduce the Entity Relationship (ER) model for conceptual data modeling. The fundamental building blocks of the ER model are entity types, attribute types, and relationship types. We discuss attribute type details such as domains, key attribute types, simple versus composite attribute types, single-valued versus multi-valued attribute types, and derived attribute types. For relationship types, we examine degree and roles, cardinalities, weak entity types, and ternary relationship types. Various examples are included for clarification.

What's included

1 video, 5 readings, 4 assignments

In this module, we will learn three additional semantic data modeling concepts: specialization/generalization, categorization, and aggregation. These concepts enhance and extend the ER model discussed in the previous module. We will also introduce an alternative conceptual model: the Unified Modeling Language (UML) class diagram. UML is a modeling language that assists in the specification, visualization, construction, and documentation of the artifacts of a software system. It also offers use case diagrams, sequence diagrams, package diagrams, deployment diagrams, and others; here we use UML for conceptual data modeling only.

What's included

1 video, 3 readings, 3 assignments

In this module, we focus on some organizational aspects of data management, including the DBMS catalog, the roles of metadata, and metadata modeling. We also discuss data quality, data governance, and the different roles in data management. Data management entails the proper management of data and of the corresponding data definitions, or metadata. Its objective is to ensure that data and metadata are of good quality, and thus a key resource for data analytics tasks and for effective and efficient managerial decision-making. By the end of this module, you will understand what this proper management involves.

What's included

5 readings, 5 assignments

As discussed in the previous modules, designing a database takes multiple steps. Once the conceptual data model is finalized, the database designer maps it to a logical data model during the logical design step. Unlike the conceptual data model, the logical data model is tied to the data model used by the implementation DBMS environment; in other words, it is intended for a specific type of DBMS. Since the top ten DBMSs in use are usually dominated by relational DBMSs such as Oracle, MySQL (open-source), and Microsoft SQL Server, we will focus on the relational model, which serves as the logical data model for relational DBMSs.

The relational model is a formal data model with a sound mathematical foundation, based on set theory and first-order predicate logic. Unlike the ER and EER models, it has no standard graphical representation, which makes it unsuitable as a conceptual data model. Given its solid theoretical underpinning, however, it is commonly adopted to build both logical and internal data models.

In this module, we introduce the relational model as a formal data model, define the different types of keys, and specify their roles along with relational constraints. We then explain in detail how to map a conceptual ER model to a relational model, including the mapping of entity types, binary one-to-one relationship types, binary one-to-many relationship types, binary many-to-many relationship types, unary relationship types, n-ary relationship types, multi-valued attribute types, and weak entity types.
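To make the ER-to-relational mapping concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The schema is hypothetical (the `department`, `employee`, `project`, `works_on`, and `employee_phone` tables and their columns are assumptions for illustration, not taken from the course material). It shows three common mapping rules: a one-to-many relationship type becomes a foreign key on the "many" side, a many-to-many relationship type becomes its own relation whose key combines both foreign keys, and a multi-valued attribute type becomes a separate relation.

```python
import sqlite3

# Hypothetical schema; all table and column names are illustrative only.
DDL = """
-- Entity types map to relations with a primary key.
CREATE TABLE department (
    dept_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL
);
-- One-to-many (a department has many employees):
-- the foreign key goes on the "many" side.
CREATE TABLE employee (
    emp_id  INTEGER PRIMARY KEY,
    name    TEXT NOT NULL,
    dept_id INTEGER NOT NULL REFERENCES department(dept_id)
);
CREATE TABLE project (
    proj_id INTEGER PRIMARY KEY,
    title   TEXT NOT NULL
);
-- Many-to-many (employees work on projects):
-- a separate relation keyed by both foreign keys.
CREATE TABLE works_on (
    emp_id  INTEGER NOT NULL REFERENCES employee(emp_id),
    proj_id INTEGER NOT NULL REFERENCES project(proj_id),
    hours   REAL,
    PRIMARY KEY (emp_id, proj_id)
);
-- Multi-valued attribute type (an employee may have several phones):
-- a separate relation holding one value per row.
CREATE TABLE employee_phone (
    emp_id INTEGER NOT NULL REFERENCES employee(emp_id),
    phone  TEXT NOT NULL,
    PRIMARY KEY (emp_id, phone)
);
"""


def build_schema():
    """Create the example schema in an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(DDL)
    return conn


def table_names(conn):
    """Return the names of all tables in the database, sorted by name."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [r[0] for r in rows]
```

With foreign keys enabled, inserting an `employee` row whose `dept_id` matches no `department` row is rejected, which is the referential integrity constraint the relational model relies on.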

What's included

2 videos, 7 readings, 5 assignments

This module first presents an overview of the insertion, deletion, and update anomalies in an unnormalized relational model and discusses informal normalization guidelines. Two key concepts used in the normal forms are defined and examined: functional dependency and prime attribute type, along with related dependency types, including full versus partial, transitive, trivial, and multivalued dependencies. The process and the formal procedures for normalizing a relational model are discussed in detail via the first normal form (1NF), the second normal form (2NF), the third normal form (3NF), the Boyce-Codd normal form (BCNF), and the fourth normal form (4NF).
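As a small illustration of functional dependencies, the sketch below checks whether a dependency is violated by a sample relation instance and computes an attribute closure, which is one way to spot a partial dependency. The relation, its attribute names, and the helper functions `fd_holds` and `closure` are assumptions made for this example and are not part of the course material. Note that a sample instance can only refute a dependency; it cannot prove that the dependency holds in general.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if no pair of rows violates the dependency lhs -> rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # Two rows that agree on lhs must also agree on rhs.
        if seen.setdefault(key, val) != val:
            return False
    return True


def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already determine lhs, we also determine rhs.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


# Hypothetical ENROLLMENT relation with key (student_id, course_id).
rows = [
    {"student_id": 1, "course_id": "DB", "student_name": "Ann", "grade": "A"},
    {"student_id": 1, "course_id": "ML", "student_name": "Ann", "grade": "B"},
    {"student_id": 2, "course_id": "DB", "student_name": "Bob", "grade": "C"},
]

# Assumed dependencies for the example.
fds = [
    ({"student_id"}, {"student_name"}),
    ({"student_id", "course_id"}, {"grade"}),
]

name_depends_on_student = fd_holds(rows, ["student_id"], ["student_name"])
grade_depends_on_course = fd_holds(rows, ["course_id"], ["grade"])
key_closure = closure({"student_id", "course_id"}, fds)
partial_closure = closure({"student_id"}, fds)
```

Here `student_name` depends on `student_id` alone, which is only part of the key, so it is a partial dependency and the relation is not in 2NF. Moving `student_id` and `student_name` into their own relation removes it.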

What's included

1 video, 6 readings, 5 assignments

Build toward a degree

This course is part of the following degree program(s) offered by Northeastern University. If you are admitted and enroll, your completed coursework may count toward your degree learning and your progress can transfer with you.¹

 

Instructor

Xuemin Jin
Northeastern University
4 courses, 618 learners
