What is ETL / ELT?
ETL (extract, transform, load) and ELT (extract, load, transform) are processes that extract data from source systems, prepare it and load it into a target platform. Both consist of the same three steps (extract, transform and load) and differ in their order: with ETL, data is transformed before it is loaded; with ELT, it is transformed only after loading, inside the target platform.
Also known as: extract transform load · extract load transform · data integration · data pipeline
Where ETL / ELT is used
ETL and ELT processes bring data from operational systems such as ERP, CRM, databases, Excel or APIs into a central analytics platform such as a data warehouse or lakehouse. In doing so, data is extracted, cleansed, harmonized and brought into an analysis-friendly structure. Such processes usually run automatically and regularly (for example nightly or continuously).
In classic ETL, the transformation happens in a separate processing step before the data is loaded. In the more modern ELT, raw data is loaded first and then transformed using the compute power of the target platform, which is often more efficient and flexible in cloud lakehouses such as Azure Databricks or Microsoft Fabric.
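The difference in ordering can be illustrated with a minimal sketch. It assumes an in-memory SQLite database as a stand-in for the target platform; the table names (sales_etl, sales_raw, sales_elt) and the sample rows are hypothetical and chosen only for illustration.

```python
import sqlite3

# Hypothetical source rows, e.g. from a CRM export (illustrative data only).
SOURCE_ROWS = [
    {"customer": " Alice ", "revenue": "120.50"},
    {"customer": "BOB", "revenue": "80"},
]

def run_etl(conn, rows):
    """ETL: transform inside the pipeline, then load only the finished result."""
    cleaned = [(r["customer"].strip().lower(), float(r["revenue"])) for r in rows]
    conn.execute("CREATE TABLE sales_etl (customer TEXT, revenue REAL)")
    conn.executemany("INSERT INTO sales_etl VALUES (?, ?)", cleaned)

def run_elt(conn, rows):
    """ELT: load raw values unchanged, then transform with the target's SQL engine."""
    conn.execute("CREATE TABLE sales_raw (customer TEXT, revenue TEXT)")
    conn.executemany(
        "INSERT INTO sales_raw VALUES (?, ?)",
        [(r["customer"], r["revenue"]) for r in rows],
    )
    conn.execute(
        """
        CREATE TABLE sales_elt AS
        SELECT lower(trim(customer)) AS customer,
               CAST(revenue AS REAL) AS revenue
        FROM sales_raw
        """
    )

conn = sqlite3.connect(":memory:")
run_etl(conn, SOURCE_ROWS)
run_elt(conn, SOURCE_ROWS)

query = "SELECT customer, revenue FROM {} ORDER BY customer"
etl_result = conn.execute(query.format("sales_etl")).fetchall()
elt_result = conn.execute(query.format("sales_elt")).fetchall()
raw_customers = [row[0] for row in conn.execute("SELECT customer FROM sales_raw")]
```

Both variants end with the same cleaned table. In the ELT variant the untouched raw table also remains in the target, so the transformation can be changed and rerun later without going back to the source system.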
Typical use cases
ETL and ELT pipelines are needed wherever data from several systems is consolidated and prepared for analysis.
- Daily or continuous loading of a data warehouse or lakehouse
- Consolidating ERP, CRM, Excel and API data into one platform
- Cleansing and harmonization as the basis for reliable reporting
- Building the layers of a medallion architecture (bronze, silver, gold)
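As a rough sketch of how cleansing, harmonization and the medallion layers fit together, the following example moves a few records from bronze (raw) through silver (cleansed) to gold (aggregated for reporting). All field names, the sample records and the rule that a non-numeric amount is rejected are assumptions made for illustration; a real lakehouse would run these steps on the platform's own engine rather than in plain Python.

```python
from collections import defaultdict

# Bronze: raw records exactly as delivered by two hypothetical sources.
bronze = [
    {"source": "erp", "cust": "ACME GmbH ", "amount": "1000.00", "country": "de"},
    {"source": "crm", "cust": "acme gmbh", "amount": "250.5", "country": "DE"},
    {"source": "crm", "cust": "Globex", "amount": "n/a", "country": "US"},
]

def to_silver(records):
    """Cleanse and harmonize; rows that fail validation go to a reject list."""
    silver, rejected = [], []
    for rec in records:
        try:
            amount = float(rec["amount"])
        except ValueError:
            rejected.append(rec)  # keep bad rows visible instead of dropping them
            continue
        silver.append({
            "customer": rec["cust"].strip().lower(),  # harmonize spelling
            "country": rec["country"].upper(),        # harmonize country codes
            "amount": amount,
        })
    return silver, rejected

def to_gold(silver_rows):
    """Aggregate to a reporting-friendly shape: total amount per customer."""
    totals = defaultdict(float)
    for row in silver_rows:
        totals[row["customer"]] += row["amount"]
    return dict(totals)

silver, rejected = to_silver(bronze)
gold = to_gold(silver)
```

Keeping rejected rows in a separate list, rather than discarding them silently, is one simple way to keep data quality problems traceable.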
How it relates & how smiit uses it
ETL/ELT is the process that fills a data platform. It is neither the platform itself (data warehouse or lakehouse) nor the modeling approach (Inmon, Kimball, Data Vault). Power Query is a lightweight form of ETL within Power BI; for larger volumes, dedicated ELT pipelines are preferred. The medallion architecture is a common way to structure ELT in the lakehouse. In the dy Project AG data platform, data from SQL Server, Excel and REST APIs was integrated via ELT pipelines on Azure Databricks. smiit builds ETL/ELT processes that are reliable, traceable and maintainable.
Common mistakes & misconceptions
- ETL and ELT are often assumed to differ only in the order of the letters. In fact, ELT loads raw data first and transforms it inside the target system, which changes how the pipeline scales.
- Many believe ELT makes ETL obsolete, but both approaches remain valid depending on data volume, target system and governance needs.
- A common error is to think the real effort is in loading. Most complexity actually lies in transformation, data quality and error handling.
Frequently asked questions
What is the difference between ETL and ELT?
With ETL, data is transformed before loading; with ELT, it is transformed afterwards in the target platform. ELT uses the compute power of modern cloud platforms and is often more flexible and efficient with large data volumes.
Do we need special tools for ETL / ELT?
For small cases, Power Query in Power BI is often enough. For larger data volumes, platforms such as Azure Databricks, Microsoft Fabric or Azure Data Factory are used. smiit selects the platform to suit the data situation.
How often should ETL / ELT pipelines run?
That depends on how up to date the analyses need to be. Common patterns are a nightly load, loads several times a day, or near-continuous processing. The higher the frequency, the more important reliable error handling and monitoring become.
What happens if an ETL / ELT pipeline fails?
Well-built pipelines log errors, can retry individual steps in a targeted way and should be designed so that a rerun produces no duplicate or inconsistent data (idempotency). Monitoring and alerts ensure that problems are noticed early.
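A minimal sketch of a targeted retry combined with an idempotent load might look like this. It again uses an in-memory SQLite database as a stand-in for the target; the orders table, the helper names run_with_retry and load_orders, and the simulated outage are all hypothetical. Idempotency comes from the primary key together with INSERT OR REPLACE, so a rerun overwrites existing rows instead of duplicating them.

```python
import sqlite3
import time

def run_with_retry(step, attempts=3, delay_seconds=0.0):
    """Run one pipeline step, retrying on failure; re-raise after the last attempt."""
    for attempt in range(1, attempts + 1):
        try:
            return step()
        except Exception as exc:
            print(f"attempt {attempt} failed: {exc}")  # stand-in for real logging
            if attempt == attempts:
                raise
            time.sleep(delay_seconds)

def load_orders(conn, rows):
    """Idempotent load: the primary key makes a rerun overwrite, not duplicate."""
    with conn:  # one transaction: commit on success, roll back on error
        conn.executemany(
            "INSERT OR REPLACE INTO orders (order_id, amount) VALUES (?, ?)", rows
        )

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, amount REAL)")
rows = [(1, 10.0), (2, 20.0)]

calls = {"count": 0}

def flaky_load():
    """Simulate a step that writes its data and then fails on the first run."""
    calls["count"] += 1
    load_orders(conn, rows)
    if calls["count"] == 1:
        raise RuntimeError("simulated outage after the write")

run_with_retry(flaky_load)
row_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```

Here the first attempt writes both rows and then fails, and the retry writes them again. Because the load is idempotent, the table still holds exactly two rows afterwards.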