Workshop: Data Literacy

Using Data To Make Intelligent Decisions

About this Workshop

Welcome to this workshop on Data Literacy - Using Data To Make Intelligent Decisions. In this workshop, you'll learn how to apply a rigorous analysis process and work with various data tools to make good decisions from authoritative data.

You'll start by developing skills to find the most authoritative data, understand and use data tools including spreadsheets, databases, and programming languages, analyze the data in context, and use the data for intelligent decisions.

This README.MD file explains how the workshop is laid out, what you will learn, and the technologies you will use in this solution.

You can view all of the source files for this workshop on this GitHub site, along with other presentations and workshops. Open this link in a new tab to find out more.

Learning Objectives

In this workshop you'll learn how to:

  • Find and source the most authoritative data
  • Work with data tools
  • Analyze data in context
  • Create analytic results from data

Business Applications of this Workshop

Businesses require a high level of data literacy in every role within an organization, and it is assumed these skills are gained in primary or secondary education. However, many of these skills are not given the proper amount of time or consideration during this phase of training, and students are often left to fill in the gaps on their own.

This workshop provides a prescriptive methodology and references to learn the basics and go much deeper into each of the data literacy topics than primary education provides, and allows a modular approach to learning. You're able to move through the workshop quickly over areas you already know, and take your time on the areas you need to know more about.

Technologies used in this Workshop

The solution includes the following technologies - although you are not limited to these, they form the basis of the workshop. At the end of the workshop you will learn how to extrapolate these components into other solutions. You will cover these at an overview level, with references to much deeper training provided.

| Technology | Description |
| --- | --- |
| Spreadsheets | Understanding tabular data manipulation and reporting is best done using an electronic spreadsheet - you'll use one in this workshop to create, edit, and explore data, and you'll learn to create reports with a spreadsheet. |
| The SQL Programming Language | The Structured Query Language provides an effective query system to manipulate data and is used in many applications. |
| The R Platform | The R Programming Language and Platform is a data-first programming language based around functions, and has many libraries for working in almost any data domain. It is also used in Data Science applications. |
| The Python Programming Language | The Python programming language is fast becoming a default data programming language, with many packages and functions available for almost any data domain. It is also highly used in Data Science projects. |
| Relational Database Management Systems | Spreadsheets and other tools are often "single-seat" based data storage systems, designed for use by one person at a time. They also do not process data, but merely hold the data and allow you to perform functions on them. A Relational Database Management System is an engine that runs on a remote system allowing multiple users to access, update and delete tabular data in a consistent, high-performing process. |
| NoSQL Database Platforms | Extremely large amounts of data processing can cause issues with a purely tabular, or relational structure. Multiple systems have evolved to solve the data access at scale problem, known collectively as "Not Only SQL" (NoSQL) platforms. |
| Data Science Tools and Platforms | Data Science is an umbrella term used to describe the tools, methods and processes to create predictions, clusters and other forms of prescriptive analysis over data. |
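
To make the SQL and Python entries above more concrete, here is a minimal sketch of querying tabular data with SQL from Python. It is an illustration only, not part of the workshop materials: the `sales` table, its columns, the sample rows, and the `units_by_region` helper are all hypothetical. It uses the `sqlite3` module from the Python standard library, which is an embedded, in-process database rather than the multi-user remote engine described for a Relational Database Management System.

```python
import sqlite3

# Hypothetical sample data: (region, product, units_sold)
rows = [
    ("North", "Widget", 120),
    ("North", "Gadget", 80),
    ("South", "Widget", 200),
    ("South", "Gadget", 50),
]


def units_by_region(records):
    """Load records into an in-memory SQLite table and total units per region."""
    conn = sqlite3.connect(":memory:")  # temporary database, discarded on close
    try:
        conn.execute(
            "CREATE TABLE sales (region TEXT, product TEXT, units INTEGER)"
        )
        # Parameterized inserts keep the data separate from the SQL text.
        conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", records)
        cursor = conn.execute(
            "SELECT region, SUM(units) FROM sales "
            "GROUP BY region ORDER BY region"
        )
        return cursor.fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for region, total in units_by_region(rows):
        print(region, total)
```

The same `SELECT ... GROUP BY` statement would look much the same against a server-based database; only the connection step would differ.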

Before Taking this Workshop

You'll need a local system that you are able to install software on. The workshop demonstrations and all examples use Microsoft Windows as the operating system. Optionally, you can install the software on a Microsoft Azure Virtual Machine (VM) and work with the solution there.

You must have a high-school equivalent or higher educational background in multiple topics for this Workshop. If you do not, you can access the pre-requisites reference for links to assist you in learning those topics.


A full pre-requisites document is located here. These instructions should be completed before the workshop starts, since you will not have time to cover these in class.

You will need a personal computer to complete the exercises, or you can use a Virtual Machine if you like. If you use an Azure Virtual Machine, remember to turn it off from the Azure Portal when you are not taking the class so that you do not incur charges (shutting down the machine from within the VM itself is not sufficient).

Workshop Details

This workshop uses various data technologies and languages, with a focus on creating decisions from data using various architectures and implementations, development languages and platforms.

Primary Audience: Professionals tasked with data analysis
Secondary Audience: Students new to the data analysis discipline who wish to learn more about the processes and tools used in that field
Level: 200-400
Type: In-Person, or from GitHub
Length: 4-8

Related Workshops

Workshop Modules

This is a modular workshop, and in each section, you'll learn concepts, technologies and processes to help you complete the solution.

01 - Find Authoritative Data: In this Module you'll learn more about finding data sources and using the most authoritative data in your analysis.
02 - Work with Data Tools: You can work with data using only your mind, or pencil and paper. In fact, while you’re learning, using basic resources like these can be optimal. But soon you will find that you need more powerful tools. This module - the longest and most complicated topic - will cover the major tools from the simple to the complex.
03 - Analyze Data In Context: Analyzing data at its simplest means looking at the data, from multiple perspectives and using multiple tools, in context. The specific process you’ll follow to analyze a given data set largely depends on the goal of the analysis, the type of data, and the area of analysis (such as business, science, etc.). This Module covers that process.
04 - Use Data for Intelligent Decisions: Applying your analysis to creating intelligent decisions is the goal of data literacy. This module covers that process, and explains how to take your analysis and form a course of action.
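
As a small illustration of what "looking at the data from multiple perspectives" in Module 03 can mean, the sketch below summarizes one data set in several ways. It is not taken from the workshop modules: the `order_values` figures and the `summarize` helper are hypothetical, and the code relies only on the `statistics` module from the Python standard library.

```python
from statistics import mean, median

# Hypothetical order values; a single large order skews the data.
order_values = [20, 22, 25, 21, 23, 500]


def summarize(values):
    """Return several views of the same data so they can be compared."""
    return {
        "mean": mean(values),      # pulled upward by the one large order
        "median": median(values),  # middle value, less affected by outliers
        "min": min(values),
        "max": max(values),
    }


if __name__ == "__main__":
    for name, value in summarize(order_values).items():
        print(name, value)
```

Here the mean is a little over 100 while the median is 22.5, so which figure supports a decision depends on the goal of the analysis, which is the point the module description makes.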

Next Steps

Next, continue to Pre-Requisites.