Big Data Architecture and Application

COS20085

Duration
One Semester or equivalent

Prerequisites
COS10081 Introduction to Data Science AND
[COS20007 Object-Oriented Programming OR COS30016 Programming in Java]

Corequisites
Nil

Contact hours
4 Hours/Week

Credit Points
12.5

Aims and learning outcomes

This unit introduces students to Big Data management through the Hadoop architecture and its ecosystem of tools. These technologies form the foundation of Big Data analytics and enable scalable management and processing of very large volumes of data. The unit is delivered in collaboration with industry and gives students the foundations needed to undertake professional certifications.

Students who successfully complete this unit should be able to:

  1. Appreciate the architecture of Hadoop clusters at both hardware and system software levels.
  2. Demonstrate skills in working with the open-source Hadoop platform through enterprise solutions.
  3. Apply Hadoop and related Big Data technologies, such as MapReduce, Hive, Impala and Pig, to develop analytics and solve common problems faced by enterprises today.

Unit information

Learning and teaching structures

2 hours of lectures and 2 hours of tutorial/laboratory per week.

In a semester, you should normally expect to spend an average of twelve and a half hours a week in total (formal contact time plus independent study time) on a 12.5 credit point unit of study.

Content

  • Introduction to Hadoop
  • Managing the Hadoop ecosystem
  • Writing MapReduce programs
  • Hadoop API
  • Development tips and techniques in Hadoop
  • Partitioners and Reducers
  • MapReduce algorithms
  • Data acquisition and workflow: Sqoop, Flume, Oozie
  • Data analysis with Pig
  • Data management and text processing with Hive
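
As an informal illustration of the MapReduce topics listed above (mappers, partitioners and reducers), the following is a minimal sketch of a word count in plain Python. It does not use the Hadoop API and is not taken from the unit materials. All function names are hypothetical, and `zlib.crc32` is used only as a convenient stable hash for the illustrative partitioner.

```python
from collections import defaultdict
from zlib import crc32


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def partition(key, num_reducers):
    """Partitioner (illustrative): choose a reducer index from a stable hash."""
    return crc32(key.encode("utf-8")) % num_reducers


def shuffle(pairs, num_reducers):
    """Group mapper output by key inside each reducer's partition."""
    partitions = [defaultdict(list) for _ in range(num_reducers)]
    for key, value in pairs:
        partitions[partition(key, num_reducers)][key].append(value)
    return partitions


def reduce_phase(key, values):
    """Reducer: sum all counts collected for one word."""
    return key, sum(values)


def word_count(lines, num_reducers=2):
    """Run map, shuffle and reduce in a single process and merge the results."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    result = {}
    for part in shuffle(pairs, num_reducers):
        for key in sorted(part):
            word, total = reduce_phase(key, part[key])
            result[word] = total
    return result
```

Calling `word_count(["big data", "Big cluster"])` counts "big" twice and the other words once. On a real cluster the three phases run on different machines; here they run in one process only to show how the data flows between them.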

General skills outcomes

Key Generic Skills:

  • Technical competence
  • Problem-solving skills
  • Analysis skills
  • Teamwork skills
  • Ability to tackle unfamiliar problems
  • Ability to work independently

Assessment

  1. Laboratory tasks (Individual) 40%
  2. Assignment 1 (Group) 20%
  3. Assignment 2 (Individual) 20%
  4. Test (Individual) 20%

Minimum requirements to pass this unit of study

In order to achieve a pass in this unit of study, you must:

  • achieve an aggregate mark for the unit of 50% or more, and
  • pass the Test

Students who do not achieve at least 50% for the Test after a second attempt will receive a maximum of 44% as the total mark for the unit and will not be eligible for a conceded pass.
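
The arithmetic behind these requirements can be sketched as follows. This is an illustration only, not an official calculator. It assumes that each component is marked out of 100, that the Test mark supplied is the final one after any second attempt, and that "pass the Test" means 50% or more. The names `WEIGHTS`, `final_mark` and `has_passed` are hypothetical.

```python
# Assessment weights in percent, taken from the Assessment list above.
WEIGHTS = {"lab": 40, "assignment1": 20, "assignment2": 20, "test": 20}
PASS_MARK = 50      # threshold for both the aggregate and the Test
TEST_FAIL_CAP = 44  # maximum total mark when the Test is not passed


def final_mark(marks):
    """Weighted aggregate, capped at 44 if the Test mark is below 50.

    Assumption: every value in `marks` is a percentage from 0 to 100.
    """
    aggregate = sum(WEIGHTS[name] * marks[name] for name in WEIGHTS) / 100
    if marks["test"] < PASS_MARK:
        aggregate = min(aggregate, TEST_FAIL_CAP)
    return aggregate


def has_passed(marks):
    """Both conditions must hold: Test passed and aggregate of 50 or more."""
    return marks["test"] >= PASS_MARK and final_mark(marks) >= PASS_MARK
```

For example, marks of 80 (laboratory tasks), 70, 60 and 40 (Test) give a weighted aggregate of 66, but because the Test mark is below 50 the total is capped at 44 and the unit is not passed.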

Study Resources

Resources and reference material

A list of reading materials and/or required texts will be made available in the Unit Outline.