# 8.1 Data Sources

## Types of Data Sources

### Databases

```
Database types
├── Relational (MySQL, PostgreSQL, Oracle)
├── NoSQL (MongoDB, Cassandra)
├── Graph (Neo4j)
└── Time series (InfluxDB)
```

### File Systems

| Format | Type | Use |
|--------|------|-----|
| CSV | Text | Tabular data |
| JSON | Text | Structured data |
| XML | Text | Configuration, data exchange |
| Excel | Binary | Spreadsheets |
| PDF | Binary | Documents |

### External Sources

```
External data sources
├── APIs (REST, SOAP)
├── Web services
├── Cloud storage
├── Streaming data (IoT)
└── Third-party providers
```

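
As a rough illustration of the REST entry in the list above, the following sketch fetches a JSON document with Python's standard library. The helper name `fetch_json`, the field names, and the endpoint mentioned in the comment are assumptions for illustration only. So that the sketch runs offline, a `data:` URL stands in for a real HTTP endpoint.

```python
import base64
import json
import urllib.request


def fetch_json(url, timeout=10):
    """Fetch a URL and decode the response body as JSON."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Stand-in for a real endpoint (hypothetical: "https://api.example.com/values").
# A data: URL carries the JSON payload itself, so no network access is needed.
payload = json.dumps([{"sensor": "t1", "value": 21.5}]).encode("utf-8")
demo_url = "data:application/json;base64," + base64.b64encode(payload).decode("ascii")

records = fetch_json(demo_url)
print(records[0]["sensor"], records[0]["value"])
```

Against a real API, only the URL would change; authentication, error handling, and paging depend on the service and are left out here.
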
---

## Database Connectivity

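### JDBC (Java)

```java
import java.sql.*;

public class JdbcExample {
    public static void main(String[] args) throws SQLException {
        // Connect to the database; try-with-resources closes all resources
        try (Connection conn = DriverManager.getConnection(
                 "jdbc:mysql://localhost:3306/datenbank",
                 "benutzer",
                 "passwort");
             // Run the query
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT * FROM tabelle")) {
            while (rs.next()) {
                System.out.println(rs.getString(1)); // first column of each row
            }
        }
    }
}
```
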
### Python (SQLAlchemy)

```python
import pandas as pd
from sqlalchemy import create_engine

# Create the engine (the default PostgreSQL driver is psycopg2)
engine = create_engine(
    'postgresql://user:password@localhost:5432/database'
)

# Read data into a DataFrame
df = pd.read_sql("SELECT * FROM tabelle", engine)
```

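
Both examples above need a running database server. To try the connect-and-query pattern without one, here is a minimal sketch using Python's built-in `sqlite3` module with an in-memory database. The columns and sample rows are made up for illustration. It also uses `?` placeholders so that values are passed separately from the SQL text.

```python
import sqlite3

# In-memory database: exists only for the lifetime of the connection
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tabelle (id INTEGER PRIMARY KEY, name TEXT, wert REAL)")
conn.executemany(
    "INSERT INTO tabelle (name, wert) VALUES (?, ?)",
    [("a", 1.5), ("b", 2.5), ("c", 4.0)],
)
conn.commit()

# Placeholders (?) keep the value out of the SQL string itself
min_wert = 2.0
rows = conn.execute(
    "SELECT name, wert FROM tabelle WHERE wert > ? ORDER BY name",
    (min_wert,),
).fetchall()
print(rows)  # [('b', 2.5), ('c', 4.0)]

conn.close()
```
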
---

## Reading Files

### CSV

```python
import pandas as pd

# Read a CSV file (comma-separated by default)
df = pd.read_csv('daten.csv')

# With an explicit delimiter (tab-separated here)
df = pd.read_csv('daten.txt', sep='\t')
```

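
CSV exports from German-locale tools often use a semicolon as delimiter and a comma as decimal separator. This is an assumption about typical input, not something the examples above state. The sketch below handles such data with the standard-library `csv` module. The inline sample text stands in for a real file.

```python
import csv
import io

# Stand-in for open('daten.csv', newline='', encoding='utf-8')
raw = io.StringIO("name;wert\nA;1,5\nB;2,25\n")

rows = []
for record in csv.DictReader(raw, delimiter=";"):
    # Convert the decimal comma to a decimal point before parsing
    record["wert"] = float(record["wert"].replace(",", "."))
    rows.append(record)

print(rows)
```

With pandas, the same input can be read in one call: `pd.read_csv('daten.csv', sep=';', decimal=',')`.
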
### JSON

```python
import json

import pandas as pd

# Load the JSON file
with open('daten.json', encoding='utf-8') as f:
    daten = json.load(f)

# Convert to a DataFrame (nested objects become dot-separated columns)
df = pd.json_normalize(daten)
```

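
To make the flattening step more concrete, here is a simplified pure-Python sketch of what happens to nested objects. It is not pandas' actual implementation. The helper `flatten` and the sample record are made up for illustration, and lists are left untouched.

```python
import json


def flatten(obj, prefix=""):
    """Flatten nested dicts into one dict with dot-separated keys."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}."))
        else:
            flat[name] = value
    return flat


# Stand-in for the contents of daten.json
daten = json.loads('[{"id": 1, "ort": {"stadt": "Berlin", "plz": "10115"}}]')
rows = [flatten(item) for item in daten]
print(rows)  # [{'id': 1, 'ort.stadt': 'Berlin', 'ort.plz': '10115'}]
```
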
### Excel

```python
import pandas as pd

# Read an Excel file (.xlsx requires the openpyxl package)
df = pd.read_excel('daten.xlsx')

# Read a specific sheet
df = pd.read_excel('daten.xlsx', sheet_name='Tabelle1')
```

---

## Database Types Compared

| Criterion | Relational | NoSQL |
|-----------|------------|-------|
| Schema | Fixed | Flexible |
| Scaling | Mostly vertical | Mostly horizontal |
| Transactions | ACID | Often eventual consistency |
| Complexity | Medium | Low |
| Examples | MySQL, PostgreSQL | MongoDB, Redis |

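
The "Schema" row can be sketched with standard-library tools only. SQLite stands in for a relational database, and a plain list of dictionaries stands in for a document store. Neither is a full representative of its category, and the table and field names are made up for illustration.

```python
import sqlite3

# Relational: the schema is fixed up front, unknown columns are rejected
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE kunde (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO kunde (name) VALUES (?)", ("Anna",))
try:
    conn.execute(
        "INSERT INTO kunde (name, email) VALUES (?, ?)",
        ("Ben", "ben@example.com"),
    )
    schema_enforced = False
except sqlite3.OperationalError:
    schema_enforced = True  # the table has no column named "email"
conn.close()

# Document-style: each record carries its own set of fields
documents = [
    {"name": "Anna"},
    {"name": "Ben", "email": "ben@example.com"},
]
fields = sorted({key for doc in documents for key in doc})
print(schema_enforced, fields)  # True ['email', 'name']
```

In the relational case, adding the `email` field requires a schema change first (`ALTER TABLE`). In the document case, the new field simply appears on the records that have it.
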
---

## Cross-References

- [[LF8-02-Schnittstellen|Next topic: Interfaces]]
- [[LF3-Datenbanken|Databases]]

---

*As of: 2024*