Fundamentals of Scalable Data Science


Resource | v1 | created by coursera-bot |
Type Course
Created unavailable
Available at coursera.org/learn/ds
Identifier unavailable

Description

Apache Spark is the de facto standard for large-scale data processing. This is the first course in a series leading to the IBM Advanced Data Science Specialization. We strongly believe that it is crucial for success to start by learning a scalable data science platform, since memory and CPU constraints are the most limiting factors when it comes to building advanced machine learning models. In this course we teach you the fundamentals of Apache Spark using Python and PySpark. We'll introduce Apache Spark in the first two weeks and learn how to apply it to basic exploratory and data pre-processing tasks in the last two weeks. Through this exercise you'll also be introduced to the most fundamental statistical measures and data visualization technologies. This gives you enough knowledge to take on the role of a data engineer in any modern environment, and it also gives you the basis for advancing your career towards data science.

Relations


supervised by IBM

IBM offers a wide range of technology and consulting services; a broad portfolio of middleware for co...


