Initial commit: IHK Ausbildung materials
32
2-Ausbildungsjahr/LF8-Datenintegration/LF8-00-Übersicht.md
Normal file
@@ -0,0 +1,32 @@
# Learning Field 8: Providing Data Across Systems

## Overview

This learning field covers the integration of data from different sources.

## Topics

| No. | Topic | Description |
|-----|-------|-------------|
| 8.1 | [[LF8-01-Datenquellen|Data Sources]] | Databases, APIs, files |
| 8.2 | [[LF8-02-Schnittstellen|Interfaces]] | REST, SOAP, GraphQL |
| 8.3 | [[LF8-03-Datenformate|Data Formats]] | JSON, XML, CSV |
| 8.4 | [[LF8-04-ETL-Prozesse|ETL Processes]] | Extract, Transform, Load |

## Learning Objectives

- Integrate data from different sources
- Develop and use interfaces
- Convert data formats
- Implement ETL processes

---

## Cross-References

- [[LF7-04-Kommunikation|Previous: Communication]]
- [[LF9-Netzwerke-Dienste|Next Learning Field: Networks and Services]]

---

*Last updated: 2024*
130
2-Ausbildungsjahr/LF8-Datenintegration/LF8-01-Datenquellen.md
Normal file
@@ -0,0 +1,130 @@
# 8.1 Data Sources

## Types of Data Sources

### Databases

```
Database types
├── Relational (MySQL, PostgreSQL, Oracle)
├── NoSQL (MongoDB, Cassandra)
├── Graph (Neo4j)
└── Time series (InfluxDB)
```

### File Systems

| Format | Type | Use |
|--------|------|-----|
| CSV | Text | Tabular data |
| JSON | Text | Structured data |
| XML | Text | Configuration, exchange |
| Excel | Binary | Spreadsheets |
| PDF | Binary | Documents |

### External Sources

```
External data sources
├── APIs (REST, SOAP)
├── Web services
├── Cloud storage
├── Stream data (IoT)
└── Third-party providers
```
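
Pulling data from an external REST source usually takes only a few lines; a minimal sketch using the `requests` package (the endpoint URL and parameter names are placeholders, not a real API):

```python
import requests

def fetch_daten(url: str, limit: int = 100) -> list:
    """Fetch JSON records from a REST endpoint (URL is a placeholder)."""
    response = requests.get(url, params={"limit": limit}, timeout=10)
    response.raise_for_status()  # raise on 4xx/5xx instead of failing silently
    return response.json()

# usage (hypothetical endpoint):
# daten = fetch_daten("https://api.example.com/daten")
```

Setting a timeout and calling `raise_for_status()` keeps a flaky source from silently producing empty or partial data downstream.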

---

## Connecting to Databases

### JDBC (Java)

```java
import java.sql.*;

// Connect to the database
Connection conn = DriverManager.getConnection(
    "jdbc:mysql://localhost:3306/datenbank",
    "benutzer",
    "passwort"
);

// Run a query
Statement stmt = conn.createStatement();
ResultSet rs = stmt.executeQuery("SELECT * FROM tabelle");
```

### Python (SQLAlchemy)

```python
import pandas as pd
from sqlalchemy import create_engine

engine = create_engine(
    'postgresql://user:password@localhost:5432/database'
)

# Read query results into a DataFrame
df = pd.read_sql("SELECT * FROM tabelle", engine)
```

---

## Reading Files

### CSV

```python
import pandas as pd

# Read a CSV file
df = pd.read_csv('daten.csv')

# With a custom delimiter
df = pd.read_csv('daten.txt', sep='\t')
```

### JSON

```python
import json
import pandas as pd

# Load JSON
with open('daten.json') as f:
    daten = json.load(f)

# Convert to a DataFrame (flattens nested objects)
df = pd.json_normalize(daten)
```

### Excel

```python
import pandas as pd

# Read an Excel file
df = pd.read_excel('daten.xlsx')

# A specific sheet
df = pd.read_excel('daten.xlsx', sheet_name='Tabelle1')
```

---

## Database Types Compared

| Criterion | Relational | NoSQL |
|-----------|-----------|-------|
| Schema | Fixed | Flexible |
| Scaling | Vertical | Horizontal |
| Transactions | ACID | Eventual consistency |
| Complexity | Medium | Low |
| Examples | MySQL, PostgreSQL | MongoDB, Redis |

---

## Cross-References

- [[LF8-02-Schnittstellen|Next Topic: Interfaces]]
- [[LF3-Datenbanken|Databases]]

---

*Last updated: 2024*
185
2-Ausbildungsjahr/LF8-Datenintegration/LF8-02-Schnittstellen.md
Normal file
@@ -0,0 +1,185 @@
# 8.2 Interfaces

## API Types

### REST (Representational State Transfer)

```
REST principles
├── Resource-oriented
├── Stateless
├── Uniform interface
├── Client-server
└── Cacheable
```

### REST Endpoints

```
Example: user API

GET    /api/benutzer      → All users
GET    /api/benutzer/123  → One user
POST   /api/benutzer      → Create
PUT    /api/benutzer/123  → Update
DELETE /api/benutzer/123  → Delete
```

### REST Example

```javascript
// Express.js server (findBenutzer/createBenutzer are placeholder functions)
const express = require('express');
const app = express();
app.use(express.json()); // parse JSON request bodies

app.get('/api/benutzer/:id', async (req, res) => {
  const benutzer = await findBenutzer(req.params.id);
  res.json(benutzer);
});

app.post('/api/benutzer', async (req, res) => {
  const neuerBenutzer = await createBenutzer(req.body);
  res.status(201).json(neuerBenutzer);
});
```

---

## SOAP (Simple Object Access Protocol)

### Differences from REST

| Criterion | REST | SOAP |
|-----------|------|------|
| Format | JSON, XML | XML only |
| Transport | HTTP | HTTP, SMTP |
| Stateless | Yes | Optional |
| Message size | Small | Large |

### SOAP Example

```xml
<?xml version="1.0"?>
<soap:Envelope xmlns:soap="http://www.w3.org/2003/05/soap-envelope">
  <soap:Header/>
  <soap:Body>
    <GetUserRequest xmlns="http://beispiel.de/api">
      <UserId>123</UserId>
    </GetUserRequest>
  </soap:Body>
</soap:Envelope>
```
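
In practice you rarely write envelopes by hand; SOAP client libraries generate them from the WSDL. As a sketch of what happens under the hood, the request above can be built and parsed back with Python's standard library (namespace URIs copied from the example):

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://www.w3.org/2003/05/soap-envelope"
API_NS = "http://beispiel.de/api"

# Build the envelope shown above
ET.register_namespace("soap", SOAP_NS)
envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
request = ET.SubElement(body, f"{{{API_NS}}}GetUserRequest")
ET.SubElement(request, f"{{{API_NS}}}UserId").text = "123"

xml_text = ET.tostring(envelope, encoding="unicode")

# Parse the UserId back out of the serialized envelope
root = ET.fromstring(xml_text)
user_id = root.find(f".//{{{API_NS}}}UserId").text
```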

---

## GraphQL

### Concepts

```
GraphQL highlights
├── One request, many resources
├── Client specifies exactly the fields it needs
├── Strongly typed (schema)
└── No over- or underfetching
```

### GraphQL Example

```graphql
# Query
query {
  user(id: "123") {
    name
    email
    orders {
      id
      total
    }
  }
}

# Response
{
  "data": {
    "user": {
      "name": "Max",
      "email": "max@example.com",
      "orders": [
        { "id": "1", "total": 99.99 }
      ]
    }
  }
}
```

---

## API Design Best Practices

### URL Design

```
Guidelines
├── Resource names in plural (e.g. /users, not /user)
├── Hyphens between words (benutzer-profil)
├── Lowercase only
├── No verbs in the path
└── Query parameters for filtering
```

### HTTP Status Codes

| Code | Meaning | Use |
|------|---------|-----|
| 200 | OK | Request succeeded |
| 201 | Created | Resource created |
| 400 | Bad Request | Invalid request |
| 401 | Unauthorized | Not authenticated |
| 404 | Not Found | Resource not found |
| 500 | Internal Server Error | Server-side error |
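
These codes are also available programmatically; a small sketch using Python's standard library `http.HTTPStatus` enum, plus a typical client-side classification helper:

```python
from http import HTTPStatus

# Look up a status code's name and standard reason phrase
status = HTTPStatus(201)
print(status.name, status.phrase)  # CREATED Created

def ist_fehler(code: int) -> bool:
    """True for 4xx (client) and 5xx (server) responses."""
    return 400 <= code < 600
```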

---

## API Documentation

### OpenAPI (Swagger)

```yaml
openapi: 3.0.0
info:
  title: Meine API
  version: 1.0.0
paths:
  /benutzer:
    get:
      summary: All users
      responses:
        '200':
          description: Success
          content:
            application/json:
              schema:
                type: array
                items:
                  $ref: '#/components/schemas/Benutzer'
```

### Tools

| Tool | Description |
|------|-------------|
| Swagger UI | Interactive documentation |
| Postman | API testing |
| Insomnia | API client |

---

## Cross-References

- [[LF8-01-Datenquellen|Previous: Data Sources]]
- [[LF8-03-Datenformate|Next Topic: Data Formats]]

---

*Last updated: 2024*
199
2-Ausbildungsjahr/LF8-Datenintegration/LF8-03-Datenformate.md
Normal file
@@ -0,0 +1,199 @@
# 8.3 Data Formats

## JSON (JavaScript Object Notation)

### Basic Structure

```json
{
  "name": "Max Mustermann",
  "alter": 25,
  "adresse": {
    "stadt": "Berlin",
    "plz": "10115"
  },
  "hobbys": ["Lesen", "Programmieren"],
  "aktiv": true
}
```

### Data Types

| Type | Example |
|------|---------|
| String | "Hallo" |
| Number | 42, 3.14 |
| Boolean | true, false |
| Array | [1, 2, 3] |
| Object | {"key": "value"} |
| Null | null |

### JSON in Python

```python
import json

# String to dictionary
daten = json.loads('{"name": "Max"}')

# Dictionary to string
text = json.dumps(daten, indent=2)

# Write to a file
with open('daten.json', 'w') as f:
    json.dump(daten, f, indent=2)
```

---

## XML (eXtensible Markup Language)

### Basic Structure

```xml
<?xml version="1.0" encoding="UTF-8"?>
<benutzer>
    <name>Max Mustermann</name>
    <alter>25</alter>
    <adresse>
        <stadt>Berlin</stadt>
    </adresse>
</benutzer>
```

### XML Attributes

```xml
<benutzer id="123" typ="admin">
    <name>Max</name>
</benutzer>
```

### XML in Python

```python
import xml.etree.ElementTree as ET

# Parse a file
baum = ET.parse('daten.xml')
wurzel = baum.getroot()

# Find elements (here: all <kind> elements anywhere in the tree)
for kind in wurzel.findall('.//kind'):
    print(kind.text)

# Build and write a document
root = ET.Element('daten')
ET.SubElement(root, 'wert').text = 'test'
baum = ET.ElementTree(root)
baum.write('ausgabe.xml')
```

---

## CSV (Comma-Separated Values)

### Basic Structure

```csv
Name,Alter,Stadt
Max,25,Berlin
Anna,30,Hamburg
Peter,28,München
```

### With Python

```python
import pandas as pd

# Read
df = pd.read_csv('daten.csv', sep=',')

# Write
df.to_csv('ausgabe.csv', index=False)

# Skip the first line of the file
df = pd.read_csv('daten.csv', skiprows=1)
```

---

## Comparison

| Criterion | JSON | XML | CSV |
|-----------|------|-----|-----|
| Readability | Good | Verbose | Good |
| Data types | Yes | Yes | No |
| Complexity | Low | Medium | Low |
| File size | Small | Large | Smallest |
| Typical use | APIs | Configuration, exchange | Tabular data |

---

## Data Conversion

### CSV to JSON

```python
import pandas as pd

# Read CSV
df = pd.read_csv('daten.csv')

# Convert to JSON (one object per row)
json_string = df.to_json(orient='records', indent=2)

# Write to a file
with open('daten.json', 'w') as f:
    f.write(json_string)
```

### XML to JSON

```python
import json
import xmltodict  # third-party package: pip install xmltodict

# XML to dictionary
with open('daten.xml', 'rb') as f:
    daten = xmltodict.parse(f)

# To JSON
json_string = json.dumps(daten, indent=2)
```

---

## Data Validation

### JSON Schema

```json
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "object",
  "properties": {
    "name": {
      "type": "string"
    },
    "alter": {
      "type": "integer",
      "minimum": 0
    }
  },
  "required": ["name"]
}
```
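
In Python such a schema is usually checked with the third-party `jsonschema` package. As an illustration of what a validator does, here is a hand-rolled check covering exactly the rules above (required fields, types, minimum) — a sketch, not a full JSON Schema implementation:

```python
schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "alter": {"type": "integer", "minimum": 0},
    },
    "required": ["name"],
}

# Map JSON Schema type names to Python types
TYPES = {"string": str, "integer": int}

def validiere(daten: dict, schema: dict) -> list:
    """Return a list of violations (empty list = valid)."""
    fehler = []
    for feld in schema.get("required", []):
        if feld not in daten:
            fehler.append(f"missing required field: {feld}")
    for feld, regeln in schema.get("properties", {}).items():
        if feld not in daten:
            continue
        wert = daten[feld]
        if not isinstance(wert, TYPES[regeln["type"]]):
            fehler.append(f"{feld}: expected {regeln['type']}")
        elif "minimum" in regeln and wert < regeln["minimum"]:
            fehler.append(f"{feld}: below minimum {regeln['minimum']}")
    return fehler
```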

---

## Cross-References

- [[LF8-02-Schnittstellen|Previous: Interfaces]]
- [[LF8-04-ETL-Prozesse|Next Topic: ETL Processes]]

---

*Last updated: 2024*
217
2-Ausbildungsjahr/LF8-Datenintegration/LF8-04-ETL-Prozesse.md
Normal file
@@ -0,0 +1,217 @@
# 8.4 ETL Processes

## What is ETL?

**ETL** = Extract, Transform, Load

```
ETL process

┌───────────┐     ┌─────────────┐     ┌────────┐
│  Extract  │  →  │  Transform  │  →  │  Load  │
└───────────┘     └─────────────┘     └────────┘
```

### Use Cases

| Case | Description |
|------|-------------|
| Data warehouse | Consolidate data for analytics |
| Migration | Move data to a new system |
| Integration | Merge data from different sources |
| Reporting | Prepare data for reports |
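
The three steps can be sketched end to end in a few lines of pandas (the CSV content is inlined here so the example is self-contained; a real pipeline would read from a file, database, or API):

```python
import io
import pandas as pd

# Extract: read raw data (inlined CSV standing in for a real source)
roh = io.StringIO("name,umsatz\nMax,100\nAnna,250\nMax,100\n")
df = pd.read_csv(roh)

# Transform: deduplicate and derive a new column
df = df.drop_duplicates()
df["umsatz_mit_mwst"] = df["umsatz"] * 1.19

# Load: write the result to the target (here a CSV string)
ergebnis = df.to_csv(index=False)
```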

---

## Extract (Extraction)

### Extraction Types

| Type | Description |
|------|-------------|
| Full load | Load all data |
| Incremental | Only new/changed data |
| CDC | Change Data Capture |

### Source Data

```
Extraction sources
├── Databases
├── Files (CSV, JSON, XML)
├── APIs
└── Streams
```

### Example

```python
import pandas as pd
import requests

# Database (conn is an existing database connection)
df_quelle = pd.read_sql("SELECT * FROM tabelle", conn)

# API
response = requests.get('https://api.example.com/daten')
df_quelle = pd.DataFrame(response.json())

# File
df_quelle = pd.read_csv('daten.csv')
```

---

## Transform (Transformation)

### Transformation Types

```
Transformations - overview
├── Data type conversion
├── Calculations
├── Aggregations
├── Filtering
├── Joins
├── Normalization
└── Cleansing
```

### Cleansing

```python
# Fill missing values with the column mean
df['alter'] = df['alter'].fillna(df['alter'].mean())

# Remove duplicates
df.drop_duplicates(inplace=True)

# Normalize casing
df['name'] = df['name'].str.lower()

# Trim whitespace
df['name'] = df['name'].str.strip()
```

### Calculations

```python
# New column
df['umsatz_mit_mwst'] = df['umsatz'] * 1.19

# Conditional values
df['rabatt'] = df.apply(
    lambda x: x['umsatz'] * 0.1 if x['umsatz'] > 100 else 0,
    axis=1
)
```

### Aggregation

```python
# Group and aggregate
df_grouped = df.groupby('kategorie').agg({
    'umsatz': 'sum',
    'anzahl': 'count'
}).reset_index()
```

---

## Load (Loading)

### Load Strategies

| Type | Description |
|------|-------------|
| Full load | Replace the table completely |
| Incremental load | Append only new rows |
| Upsert | Update if present, otherwise insert |
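
An upsert can be expressed directly in SQL; a self-contained sketch using SQLite's `INSERT ... ON CONFLICT` clause (PostgreSQL uses the same syntax), run against an in-memory database:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE kunden (id INTEGER PRIMARY KEY, umsatz REAL)")
conn.execute("INSERT INTO kunden VALUES (1, 100.0)")

# Upsert: insert a new row, or update the existing one on key conflict
conn.execute(
    """
    INSERT INTO kunden (id, umsatz) VALUES (?, ?)
    ON CONFLICT(id) DO UPDATE SET umsatz = excluded.umsatz
    """,
    (1, 250.0),
)

umsatz = conn.execute("SELECT umsatz FROM kunden WHERE id = 1").fetchone()[0]
print(umsatz)  # 250.0
```

The `excluded` pseudo-table refers to the row that would have been inserted, so the update picks up the new values without repeating them.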

### Target Systems

```
Load targets
├── Database (MySQL, PostgreSQL)
├── Data warehouse (Snowflake, BigQuery)
├── File (CSV, JSON)
└── Cloud storage
```

### Example

```python
# Load into a database table
df.to_sql('ziel_tabelle', conn, if_exists='replace', index=False)

# Write to CSV
df.to_csv('ausgabe.csv', index=False)

# Write to Excel
df.to_excel('ausgabe.xlsx', index=False)
```

---

## Orchestration

### Tools

| Tool | Description |
|------|-------------|
| Apache Airflow | Workflow orchestration |
| Luigi | Python-based pipelines |
| dbt | SQL-based data transformation |
| Talend | ETL suite |
| Apache NiFi | Data flow automation |

### Airflow DAG

```python
from airflow import DAG
from airflow.operators.python import PythonOperator
from datetime import datetime

# extract_daten, transform_daten and load_daten are your own functions
dag = DAG('etl_pipeline', start_date=datetime(2024, 1, 1))

extract = PythonOperator(
    task_id='extract',
    python_callable=extract_daten,
    dag=dag
)

transform = PythonOperator(
    task_id='transform',
    python_callable=transform_daten,
    dag=dag
)

load = PythonOperator(
    task_id='load',
    python_callable=load_daten,
    dag=dag
)

# Define execution order
extract >> transform >> load
```

---

## Error Handling

```
ETL error strategies
├── Logging
├── Alerting
├── Retry logic
├── Quarantine (problem records)
└── Monitoring
```
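
Retry logic is a common first line of defense against transient failures (network hiccups, locked tables); a minimal sketch with exponential backoff:

```python
import time

def mit_retry(funktion, versuche=3, wartezeit=1.0):
    """Call funktion(); on failure retry with exponentially growing delay."""
    for versuch in range(1, versuche + 1):
        try:
            return funktion()
        except Exception:
            if versuch == versuche:
                raise  # out of retries: re-raise for the caller/monitoring
            time.sleep(wartezeit * 2 ** (versuch - 1))

# usage: mit_retry(lambda: pd.read_sql("SELECT ...", conn))
```

Re-raising on the last attempt matters: swallowing the error would turn a failed load into silently missing data.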

---

## Cross-References

- [[LF8-03-Datenformate|Previous: Data Formats]]
- [[LF9-Netzwerke-Dienste|Next Learning Field: Networks and Services]]

---

*Last updated: 2024*