Python Refresher: Data Types and Control Flow
Machine learning code is still Python code. Before diving into NumPy and scikit-learn, let's make sure the fundamentals are solid. This lesson covers the core built-in data types and control-flow patterns you'll encounter in real ML codebases.
1. Numeric Types: int, float, bool
# int — arbitrary precision in Python
sample_count = 50_000 # underscores for readability
big = 2 ** 100 # no overflow!
# float: 64-bit IEEE 754 double precision
learning_rate = 0.001
pi = 3.141592653589793
# bool — subclass of int (True == 1, False == 0)
is_converged = False
print(True + True) # 2
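Two consequences of the types above are worth seeing in action. The sketch below assumes a hypothetical list of per-sample results, `predictions_correct`, that is not part of the lesson. It shows why floats should be compared with a tolerance, and how `bool` being a subclass of `int` lets you count `True` values with `sum()`.

```python
import math

# Floats are binary fractions, so some decimal values are only approximated.
total = 0.1 + 0.2
print(total == 0.3)              # False
print(math.isclose(total, 0.3))  # True

# bool is a subclass of int, so summing booleans counts the True values.
predictions_correct = [True, False, True, True]  # hypothetical per-sample results
n_correct = sum(predictions_correct)
print(n_correct / len(predictions_correct))      # 0.75
```

When checking whether a metric has reached a target, `math.isclose` (or an explicit tolerance) is usually a safer choice than `==`.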
2. Strings
name = "Random Forest"
description = 'A popular ensemble method'
# f-strings (Python 3.6+) — use them everywhere
accuracy = 0.9534
print(f"Model: {name}, Accuracy: {accuracy:.2%}")
# Output: Model: Random Forest, Accuracy: 95.34%
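The `:.2%` in the f-string above is a format specifier. As a short sketch of a few other specifiers that tend to be useful when logging metrics (the variables `n_samples` and `model_name` are illustrative and not part of the lesson):

```python
accuracy = 0.9534
n_samples = 50000      # hypothetical sample count
model_name = "SVM"     # hypothetical model name

print(f"{accuracy:.2%}")     # 95.34%  (percentage, 2 decimals)
print(f"{accuracy:.3f}")     # 0.953   (fixed-point, 3 decimals)
print(f"{n_samples:,}")      # 50,000  (thousands separator)
print(f"[{model_name:<8}]")  # [SVM     ]  (left-aligned, width 8)
print(f"[{model_name:>8}]")  # [     SVM]  (right-aligned, width 8)
print(f"{accuracy=}")        # accuracy=0.9534  (requires Python 3.8+)
```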
# Useful string methods for data cleaning
raw = " Hello, World! "
raw.strip() # 'Hello, World!'
raw.lower() # ' hello, world! '
raw.replace(",", "") # ' Hello World! '
"a,b,c".split(",") # ['a', 'b', 'c']
3. Collections: list, tuple, dict, set
| Type | Mutable? | Ordered? | Duplicates? | Example |
|---|---|---|---|---|
| list | Yes | Yes | Yes | [1, 2, 3] |
| tuple | No | Yes | Yes | (1, 2, 3) |
| dict | Yes | Insertion order (3.7+) | Keys: No | {"a": 1} |
| set | Yes | No | No | {1, 2, 3} |
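The properties in the table can be checked directly. The following is a minimal sketch with made-up values, exercising one property per type:

```python
# list: mutable, so items can be reassigned in place
values = [3, 1, 3]
values[0] = 2
print(values)  # [2, 1, 3]

# tuple: immutable, so item assignment raises TypeError
point = (1, 2, 3)
tuple_error = None
try:
    point[0] = 99
except TypeError as err:
    tuple_error = str(err)
    print("tuple:", tuple_error)

# dict: keys are unique; a repeated key keeps the last value
counts = {"a": 1, "a": 2}
print(counts)  # {'a': 2}

# set: duplicates collapse to a single element
unique = {1, 1, 2}
print(len(unique))  # 2
```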
# list — your workhorse collection
scores = [0.82, 0.85, 0.91, 0.88]
scores.append(0.93)
scores.sort(reverse=True) # [0.93, 0.91, 0.88, 0.85, 0.82]
# tuple — immutable, often used for shapes and coordinates
shape = (28, 28, 1) # image dimensions: height, width, channels
height, width, channels = shape # tuple unpacking
# dict — map keys to values
hyperparams = {
    "n_estimators": 100,
    "max_depth": 5,
    "learning_rate": 0.1,
}
hyperparams["min_samples_split"] = 10 # add a key
print(hyperparams.get("dropout", 0.0)) # safe access with default
# set — unique elements, fast membership testing
feature_names = {"age", "income", "age", "education"}
print(feature_names) # e.g. {'age', 'income', 'education'} (order may vary)
print("age" in feature_names) # True; average-case O(1) lookup
4. Type Conversion
int("42") # 42
float("3.14") # 3.14
str(99) # '99'
bool(0) # False
bool("") # False
bool([]) # False — empty collections are falsy
list("abc") # ['a', 'b', 'c']
tuple([1,2,3]) # (1, 2, 3)
set([1,1,2,2,3]) # {1, 2, 3}
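Conversions can also fail or behave unexpectedly, which matters when parsing raw data. The sketch below shows a few common pitfalls; `parse_float` is a hypothetical helper written for this example, not a built-in.

```python
# int() rejects strings that are not integer literals
try:
    int("3.14")
except ValueError as err:
    print("ValueError:", err)

print(int(float("3.14")))  # 3   (convert via float; int() truncates toward zero)
print(int(-2.7))           # -2  (truncation, not rounding down)
print(bool("False"))       # True: any non-empty string is truthy
print(bool("0"))           # True


def parse_float(text, default=None):
    """Hypothetical helper: return float(text), or default if parsing fails."""
    try:
        return float(text)
    except (TypeError, ValueError):
        return default


print(parse_float("0.5"))  # 0.5
print(parse_float("n/a"))  # None
```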
5. Conditional Logic: if / elif / else
accuracy = 0.87
if accuracy >= 0.95:
    verdict = "Excellent"
elif accuracy >= 0.85:
    verdict = "Good"
elif accuracy >= 0.70:
    verdict = "Acceptable"
else:
    verdict = "Needs improvement"
print(f"Model verdict: {verdict}") # Good
# Ternary expression — one-liner conditional
probability = 0.73 # example predicted probability
label = "spam" if probability > 0.5 else "ham"
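Branches are tested from top to bottom and the first match wins, so thresholds need to be ordered from strictest to loosest. A small sketch that wraps the same thresholds in a hypothetical function, `verdict_for`, and checks the boundary values:

```python
def verdict_for(accuracy: float) -> str:
    """Hypothetical helper mirroring the thresholds in the if/elif chain."""
    if accuracy >= 0.95:
        return "Excellent"
    if accuracy >= 0.85:
        return "Good"
    if accuracy >= 0.70:
        return "Acceptable"
    return "Needs improvement"


# Values exactly on a threshold fall into that threshold's branch.
for value in (0.95, 0.85, 0.70, 0.6999):
    print(value, verdict_for(value))
```

Because each branch returns immediately, plain `if` statements behave the same as `elif` here.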
6. Loops: for and while
# for loop — iterate over any iterable
models = ["Logistic Regression", "SVM", "Random Forest"]
for model in models:
    print(f"Training {model}...")
# range() for numeric loops
for epoch in range(1, 11): # 1 through 10
    print(f"Epoch {epoch}/10")
# enumerate — get index AND value
for i, model in enumerate(models):
    print(f"{i}: {model}")
# zip — iterate over multiple sequences in parallel
names = ["Model A", "Model B", "Model C"]
scores = [0.91, 0.88, 0.93]
for name, score in zip(names, scores):
    print(f"{name}: {score:.2f}")
# while loop — when you don't know iterations in advance
loss = 10.0
while loss > 0.01:
    loss *= 0.5 # simulate training
    print(f"Loss: {loss:.4f}")
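A `while` loop whose condition is never met runs forever, so training-style loops often carry an iteration cap as well. The following is a minimal sketch using the same halving "loss"; `max_epochs` is an assumed safety limit, not something from the lesson.

```python
loss = 10.0
max_epochs = 100  # hypothetical safety cap on the number of iterations
epoch = 0

# Stop when the loss is small enough OR the cap is reached, whichever comes first.
while loss > 0.01 and epoch < max_epochs:
    loss *= 0.5  # simulate training
    epoch += 1

print(f"Stopped after {epoch} epochs, loss={loss:.4f}")
# Stopped after 10 epochs, loss=0.0098
```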
7. Comprehensions
Comprehensions are a concise, Pythonic way to build new collections. They replace simple for-loops with a single expression.
# List comprehension — square each number
squares = [x ** 2 for x in range(10)]
print("Squares:", squares)
# With a filter
even_squares = [x ** 2 for x in range(10) if x % 2 == 0]
print("Even squares:", even_squares)
# Dict comprehension — invert a dictionary
original = {"a": 1, "b": 2, "c": 3}
inverted = {v: k for k, v in original.items()}
print("Inverted:", inverted)
# Set comprehension
words = ["hello", "HELLO", "Hello", "world"]
unique_lower = {w.lower() for w in words}
print("Unique:", unique_lower)
# Nested comprehension — flatten a matrix
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
flat = [val for row in matrix for val in row]
print("Flat:", flat)
Output:
Squares: [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
Even squares: [0, 4, 16, 36, 64]
Inverted: {1: 'a', 2: 'b', 3: 'c'}
Unique: {'hello', 'world'}
Flat: [1, 2, 3, 4, 5, 6, 7, 8, 9]
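To make the "replaces a simple for-loop" claim concrete, here is a sketch that writes two of the comprehensions as explicit loops and checks that the results match. The `_loop` variable names are introduced only for this comparison.

```python
# Filtered list comprehension vs. the equivalent explicit loop
even_squares_loop = []
for x in range(10):
    if x % 2 == 0:
        even_squares_loop.append(x ** 2)

even_squares = [x ** 2 for x in range(10) if x % 2 == 0]
print(even_squares_loop == even_squares)  # True

# A nested comprehension reads in the same order as the nested loops
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
flat_loop = []
for row in matrix:
    for val in row:
        flat_loop.append(val)

flat = [val for row in matrix for val in row]
print(flat_loop == flat)  # True
```

If a comprehension needs more than one filter or a nested condition, the explicit loop is often easier to read.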