Course catalog: Big Data Analysis with Scala and Spark (training course)

Course outline:

Big Data Analysis with Scala and Spark (training course)

 

 

 

WEEK 1

Getting Started + Spark Basics

Get up and running with Scala on your computer.

Complete an example assignment to familiarize yourself with our unique way of submitting assignments.

This week, we'll bridge the gap between data parallelism in the shared-memory scenario (covered in the prerequisite Parallel Programming course) and the distributed scenario. We'll look at important concerns that arise in distributed systems, like latency and failure. We'll go on to cover the basics of Spark, a functionally-oriented framework for big data processing in Scala. We'll end the first week by exercising what we learned about Spark, getting our hands dirty right away by analyzing a real-world data set.
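As a rough illustration of the Spark basics mentioned above, the sketch below builds an RDD and counts word occurrences. It is not taken from the course material: the object name `RddBasicsSketch`, the helper `countOccurrences`, the local master setting, and the in-memory sample lines are all assumptions made for the example.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object RddBasicsSketch {
  // Hypothetical helper: counts how often `word` occurs across all lines.
  def countOccurrences(lines: RDD[String], word: String): Long = {
    // Transformations are lazy: these two lines only describe the computation.
    val words = lines.flatMap(_.split("\\s+"))
    val matches = words.filter(_ == word)
    // count() is an action: it triggers the distributed computation.
    matches.count()
  }

  def main(args: Array[String]): Unit = {
    // Local master and app name are illustrative choices, not course settings.
    val conf = new SparkConf().setMaster("local[*]").setAppName("rdd-basics-sketch")
    val sc = new SparkContext(conf)
    try {
      // Small in-memory sample standing in for a real data set.
      val lines = sc.parallelize(Seq("spark is lazy", "scala is fun", "spark is fast"))
      println(countOccurrences(lines, "spark"))
    } finally sc.stop()
  }
}
```

The code reads much like ordinary Scala collection code, but the data may be spread over several machines and nothing runs until an action such as `count()` is called.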

WEEK 2

Reduction Operations & Distributed Key-Value Pairs

This week, we'll look at a special kind of RDD called pair RDDs.
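To make the idea of a pair RDD concrete, here is a minimal sketch, assuming a hypothetical purchases data set keyed by customer id. The names `PairRddSketch`, `totalsByKey`, and `withNames`, and the sample data, are invented for illustration. `reduceByKey` and `join` are the standard Spark operations available on RDDs of key-value tuples.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object PairRddSketch {
  // Reduction: sum the amounts for each customer id.
  def totalsByKey(purchases: RDD[(Int, Double)]): RDD[(Int, Double)] =
    purchases.reduceByKey(_ + _)

  // Inner join: keep only the ids that appear in both RDDs.
  def withNames(totals: RDD[(Int, Double)],
                names: RDD[(Int, String)]): RDD[(Int, (Double, String))] =
    totals.join(names)

  def main(args: Array[String]): Unit = {
    // Local master and app name are illustrative choices.
    val sc = new SparkContext(
      new SparkConf().setMaster("local[*]").setAppName("pair-rdd-sketch"))
    try {
      // Hypothetical sample data: (customerId, amount) and (customerId, name).
      val purchases = sc.parallelize(Seq(1 -> 10.0, 2 -> 5.0, 1 -> 2.5))
      val names = sc.parallelize(Seq(1 -> "Ada", 3 -> "Linus"))

      val joined = withNames(totalsByKey(purchases), names)
      joined.collect().foreach(println)
    } finally sc.stop()
  }
}
```

With this sample data only customer 1 survives the join, because an inner join drops keys that are missing on either side.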

With this specialized kind of RDD in hand, we'll cover essential operations on large data sets, such as reductions and joins.

WEEK 3

Partitioning and Shuffling

This week we'll look at some of the performance implications of using operations like joins.

Is it possible to get the same result without having to pay for the overhead of moving data over the network?
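One way to explore this question is to partition a pair RDD up front and check which operations keep that partitioning. The sketch below is only an illustration under stated assumptions: the object name `PartitioningSketch`, the helper `prePartition`, the choice of four partitions, and the sample data are all hypothetical. `HashPartitioner`, `partitionBy`, `persist`, `mapValues`, and `join` with an explicit partitioner are standard Spark calls.

```scala
import org.apache.spark.{HashPartitioner, Partitioner, SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object PartitioningSketch {
  // Hypothetical helper: hash-partition by key once and cache the result,
  // so later key-based operations can reuse the same layout.
  def prePartition(rdd: RDD[(Int, String)], p: Partitioner): RDD[(Int, String)] =
    rdd.partitionBy(p).persist()

  def main(args: Array[String]): Unit = {
    // Local master and app name are illustrative choices.
    val sc = new SparkContext(
      new SparkConf().setMaster("local[*]").setAppName("partitioning-sketch"))
    try {
      val partitioner = new HashPartitioner(4) // 4 is an arbitrary choice
      val users = prePartition(
        sc.parallelize(Seq(1 -> "Ada", 2 -> "Grace", 3 -> "Linus")), partitioner)
      val events = sc.parallelize(Seq(1 -> "login", 3 -> "logout", 1 -> "purchase"))

      // Joining with the same partitioner lets Spark reuse the layout of `users`;
      // only `events` still has to be moved to matching partitions.
      val joined = users.join(events, partitioner)

      println(users.partitioner.isDefined)
      // mapValues cannot change keys, so the partitioner is kept.
      println(users.mapValues(_.toUpperCase).partitioner.isDefined)
      // map may change keys, so Spark drops the partitioner.
      println(users.map { case (id, name) => (id, name) }.partitioner.isDefined)
      println(joined.partitioner == users.partitioner)
    } finally sc.stop()
  }
}
```

Calling `persist()` after `partitionBy` matters in this sketch: without it, the partitioning work could be redone each time `users` is reused.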

We'll answer this question by delving into how we can partition our data to achieve better data locality, in turn optimizing some of our Spark jobs.

WEEK 4

Structured data: SQL, DataFrames, and Datasets

With our newfound understanding of the cost of data movement in a Spark job, and some experience optimizing jobs for data locality last week, this week we'll focus on how we can more easily achieve similar optimizations. Can structured data help us? We'll look at Spark SQL and its powerful optimizer, which uses structure to apply impressive optimizations. We'll move on to cover DataFrames and Datasets, which give us a way to mix RDDs with the powerful automatic optimizations behind Spark SQL.
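The following sketch shows one way the three APIs can sit side by side. It assumes a hypothetical `Listing` record type and sample data. The names `StructuredSketch`, `averagePriceByCity`, and `cheaperThan` are invented for the example, and the master setting is illustrative. The Spark calls themselves (`groupBy`, `agg`, `filter`, `createOrReplaceTempView`, `spark.sql`, `explain`) are standard.

```scala
import org.apache.spark.sql.{DataFrame, Dataset, SparkSession}
import org.apache.spark.sql.functions.{avg, col}

// Hypothetical record type; defined at top level so Spark can derive an encoder.
case class Listing(city: String, price: Double)

object StructuredSketch {
  // Untyped, column-based aggregation: the optimizer sees the whole expression.
  def averagePriceByCity(listings: Dataset[Listing]): DataFrame =
    listings.groupBy(col("city")).agg(avg(col("price")).as("avg_price"))

  // Typed, RDD-like operation on a Dataset using an ordinary Scala lambda.
  def cheaperThan(listings: Dataset[Listing], limit: Double): Dataset[Listing] =
    listings.filter(_.price < limit)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]").appName("structured-sketch").getOrCreate()
    import spark.implicits._
    try {
      val listings = Seq(
        Listing("Zurich", 100.0), Listing("Zurich", 200.0), Listing("Lausanne", 50.0)
      ).toDS()

      val byCity = averagePriceByCity(listings)
      byCity.explain() // prints the physical plan chosen by the optimizer
      byCity.show()

      // The same aggregation expressed as a SQL query.
      listings.createOrReplaceTempView("listings")
      spark.sql(
        "SELECT city, AVG(price) AS avg_price FROM listings GROUP BY city").show()

      cheaperThan(listings, 150.0).show()
    } finally spark.stop()
  }
}
```

The column-based and SQL versions describe the aggregation declaratively, which gives the optimizer room to plan it. The typed `filter` takes an opaque Scala function, which is more flexible but gives the optimizer less to work with.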


 
