Introduction to Big Data and Teradata Aster

This is a note for the Teradata Aster Basics 6.10 Exam, a.k.a. TACP (Teradata Aster Certified Professional).

The recommended courses are as follows; this note covers the 2nd course.

SQL vs SQL-MR: SQL is better for standard transformations. SQL-MR is better for custom transformations (e.g. log extraction).
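
To illustrate what a "custom transformation" such as log extraction looks like, here is a minimal Python sketch of a row-wise function that turns raw log lines into structured rows. This is not SQL-MR code and does not use any Aster API; the log pattern, the field names, and the functions extract_log_row and extract_logs are all assumptions made for illustration.

```python
import re

# Hypothetical Apache-style access log pattern; field names are illustrative only.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3})'
)


def extract_log_row(line):
    """One raw log line in, one structured row out (None if the line does not parse)."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    row = match.groupdict()
    row["status"] = int(row["status"])
    return row


def extract_logs(lines):
    """Apply the row function to every line and drop the lines that did not parse."""
    return [row for row in map(extract_log_row, lines) if row is not None]
```

Parsing like this is awkward to express in plain SQL, which is the kind of case the note assigns to SQL-MR.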

R creates multiple copies of data during processing, and doesn’t automatically run in parallel. Aster R runs in parallel across the Aster MPP architecture.
FSE (Foreign Server Encapsulation): Supports remote data platforms other than Aster and Teradata (e.g. Oracle, Hadoop, DB2, etc.).
QueryGrid Aster-Teradata: Join tables in Teradata and Aster Database.
QueryGrid Aster-Hadoop: Copy data from Hadoop to Aster, and from Aster to Hadoop.
HCatalog: Table metastore service for Hive, Pig, and so on.
Deployment Options: Aster Appliance, Cloud, Software Only (RHEL), and Aster on Hadoop.
Data Preparation: IPGeo, Pivot, JsonParser, Apache Log Parser, and PSTParserAFS.

Aster Analytics Portfolio

Aster Database

Queen: Cluster Coordination, Distributed Query Planning, System Tables
Worker Node: Send back results to Queen
Loader: Loading data to Aster
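
The division of roles above can be pictured with a small scatter-gather sketch in Python. It is a conceptual illustration only, not Aster code: the functions load, worker_sum, and queen_query are hypothetical names, hash partitioning is an assumed distribution scheme, and threads stand in for separate nodes.

```python
from concurrent.futures import ThreadPoolExecutor


def load(rows, num_workers):
    """Loader role: distribute (key, value) rows across workers by hashing the key."""
    partitions = [[] for _ in range(num_workers)]
    for key, value in rows:
        partitions[hash(key) % num_workers].append((key, value))
    return partitions


def worker_sum(partition):
    """Worker role: compute a partial result on local data and send it back."""
    totals = {}
    for key, value in partition:
        totals[key] = totals.get(key, 0) + value
    return totals


def queen_query(partitions):
    """Queen role: fan the work out to the workers, then merge their partial results."""
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        partials = list(pool.map(worker_sum, partitions))
    merged = {}
    for partial in partials:
        for key, total in partial.items():
            merged[key] = merged.get(key, 0) + total
    return merged
```

For example, queen_query(load(rows, 3)) returns the per-key totals regardless of which worker ended up holding which rows.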

Access Control

Multi-Version Concurrency Control (MVCC): Eliminates the need for read locks while ensuring that the database maintains the key ACID (Atomicity, Consistency, Isolation, Durability) properties.
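
The idea behind MVCC can be sketched in a few lines of Python: each write adds a new version instead of overwriting the old one, so a reader holding an older snapshot keeps seeing consistent data without locking. This is a simplified, single-threaded illustration and not how Aster implements it; the class MVCCStore and its methods are hypothetical.

```python
class MVCCStore:
    """Toy multi-version store: writers append versions, readers pick by snapshot."""

    def __init__(self):
        self._versions = {}  # key -> list of (commit_ts, value), in commit order
        self._clock = 0      # logical commit counter

    def write(self, key, value):
        """Commit a new version; older versions stay in place for running readers."""
        self._clock += 1
        self._versions.setdefault(key, []).append((self._clock, value))
        return self._clock

    def snapshot(self):
        """Start a reader: it only sees versions committed up to this point."""
        return self._clock

    def read(self, key, snapshot_ts):
        """Return the newest value visible to the snapshot, without taking a lock."""
        for commit_ts, value in reversed(self._versions.get(key, [])):
            if commit_ts <= snapshot_ts:
                return value
        return None
```

A reader that took its snapshot before a later write still reads the earlier value, while a new snapshot sees the latest one.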

Two Level Query Optimization

Dynamic Workload Management

nCluster’s columnar capability is a custom development by Aster; it is not part of PostgreSQL. The columnar limitation is that tables are append-only (no updates or deletes).
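
As a rough picture of what "columnar" and "append-only" mean together, the following Python sketch stores each column in its own list, allows appends and single-column scans, and rejects updates and deletes. It is a conceptual model only, not Aster's storage format; the class AppendOnlyColumnTable and its methods are hypothetical.

```python
class AppendOnlyColumnTable:
    """Toy column store: one list per column, rows can only be appended."""

    def __init__(self, columns):
        self._columns = {name: [] for name in columns}

    def append(self, row):
        """Add one row, splitting its values across the per-column lists."""
        if set(row) != set(self._columns):
            raise ValueError("row must supply every column exactly once")
        for name, value in row.items():
            self._columns[name].append(value)

    def scan(self, name):
        """Read a single column without touching the others."""
        return list(self._columns[name])

    def update(self, *args, **kwargs):
        raise NotImplementedError("append-only: updates are not supported")

    def delete(self, *args, **kwargs):
        raise NotImplementedError("append-only: deletes are not supported")
```

Scanning one column reads only that column's list, which is the usual advantage cited for columnar layouts in analytic queries.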

Columnar advantage and limitation

Three compression levels

Informatica has an Aster connector. Other tools use the nCluster loader.

Aqua Data Studio:

Viewpoint portlet for Aster