Build Data Lake for Enterprise

[Figure: Comparison: Data Warehouse and Data Lake. Panel "Data Lake Architecture": Data Sources (Structured Data, Unstructured Data), Data Lake (Raw Data), Process, Analytics, Reporting. Panel "Data Warehouse Architecture": Data Sources (Structured Data), ETL (Extract Transform Load), Data Warehouse (Metadata + Summary + Raw Data), Reporting, Analytics.]

The figure represents the process of data collection, storage, and reporting in Big Data.

The comparison between a data warehouse and a data lake is as follows:

| | Data Warehouse | Data Lake |
|---|---|---|
| Data | Structured and transformed | Structured/semi-structured, processed/raw |
| Working | Structure-ingest-analyze | Ingest-analyze-understand |
| Processing | Schema-on-write | Schema-on-read |
| Storage | Expensive when volumes are high | Built for low-cost storage |
| Flexibility | Fixed configuration, not very flexible | No particular structure; configure and reconfigure as per your requirements |
| Cost/Efficiency | Efficiently uses CPU/IO | Efficiently uses storage and processing capabilities at very low |
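The "Processing" row (schema-on-write versus schema-on-read) may be easier to see in code. The following is a minimal sketch, not a real warehouse or lake implementation: the `SCHEMA` fields and the functions `conform`, `warehouse_write`, `lake_write`, and `lake_read` are hypothetical names invented for illustration, and plain Python lists stand in for storage.

```python
import json

# Hypothetical schema for illustration: field name -> type used to coerce the value.
SCHEMA = {"user_id": int, "amount": float}


def conform(record, schema):
    """Keep only the schema's fields, coerced to their types.

    Raises KeyError, ValueError, or TypeError if the record does not fit.
    """
    return {name: cast(record[name]) for name, cast in schema.items()}


def warehouse_write(table, record):
    # Schema-on-write: the record is validated and transformed BEFORE it is
    # stored, so a record that does not fit is rejected at load time.
    table.append(conform(record, SCHEMA))


def lake_write(store, raw_line):
    # The lake stores whatever arrives, untouched.
    store.append(raw_line)


def lake_read(store, schema):
    # Schema-on-read: a schema is applied only when the data is queried.
    # Different readers can apply different schemas to the same raw data.
    rows = []
    for line in store:
        try:
            rows.append(conform(json.loads(line), schema))
        except (KeyError, ValueError, TypeError):
            continue  # this raw record does not fit the reader's schema
    return rows


warehouse = []
warehouse_write(warehouse, {"user_id": "7", "amount": "19.5"})

lake = []
lake_write(lake, '{"user_id": "7", "amount": "19.5", "note": "x"}')
lake_write(lake, "free text log line")

orders = lake_read(lake, SCHEMA)
notes = lake_read(lake, {"note": str})
```

In this sketch the warehouse path fails fast on a bad record, while the lake path accepts every line and defers the decision to each reader. That is the trade-off the table summarizes: fixed structure up front versus the freedom to reconfigure later.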