AIMaks

Python Refresher: Data Types and Control Flow


Machine learning code is still Python code. Before diving into NumPy and scikit-learn, make sure the fundamentals are solid. This lesson reviews the core built-in data types and control-flow patterns you'll encounter in real ML codebases.

1. Numeric Types: int, float, bool

python
# int — arbitrary precision in Python
sample_count = 50_000          # underscores for readability
big = 2 ** 100                 # no overflow!

# float — 64-bit IEEE 754 by default
learning_rate = 0.001
pi = 3.141592653589793

# bool — subclass of int (True == 1, False == 0)
is_converged = False
print(True + True)             # 2
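
Two consequences of these types come up often in ML code: booleans can be summed to count matches, and floats are better compared with a tolerance than with `==`. The sketch below illustrates both under made-up data; `y_true` and `y_pred` are hypothetical names chosen for the example, not part of the lesson.

```python
import math

# Hypothetical labels and predictions, for illustration only
y_true = [1, 0, 1, 1, 0]
y_pred = [1, 0, 0, 1, 0]

# Each comparison yields a bool; because bool is a subclass of int,
# sum() counts the True values directly.
correct = sum(t == p for t, p in zip(y_true, y_pred))
accuracy = correct / len(y_true)
print(correct, accuracy)            # 4 0.8

# Floats are binary approximations, so exact equality can surprise you.
total = 0.1 + 0.2
print(total == 0.3)                 # False
print(math.isclose(total, 0.3))     # True
```

`math.isclose` uses a relative tolerance by default, which is usually what you want when comparing losses or metrics.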

2. Strings

python
name = "Random Forest"
description = 'A popular ensemble method'

# f-strings (Python 3.6+) — use them everywhere
accuracy = 0.9534
print(f"Model: {name}, Accuracy: {accuracy:.2%}")
# Output: Model: Random Forest, Accuracy: 95.34%

# Useful string methods for data cleaning
raw = "  Hello, World!  "
raw.strip()          # 'Hello, World!'
raw.lower()          # '  hello, world!  '
raw.replace(",", "") # '  Hello World!  '
"a,b,c".split(",")   # ['a', 'b', 'c']

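In practice these string methods are usually chained. As a rough sketch, the helper below (a hypothetical `clean_header` function, not something from a library) normalizes a messy comma-separated header line into column names:

```python
def clean_header(raw_header):
    """Split a comma-separated header line into normalized column names."""
    columns = []
    for part in raw_header.split(","):
        # strip() removes surrounding whitespace, lower() normalizes case,
        # and replace() swaps inner spaces for underscores.
        columns.append(part.strip().lower().replace(" ", "_"))
    return columns


print(clean_header("  Age , Annual Income,EDUCATION  "))
# ['age', 'annual_income', 'education']
```

Strings are immutable, so every method call here returns a new string and leaves the original unchanged.
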
3. Collections: list, tuple, dict, set

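| Type  | Mutable? | Ordered?               | Duplicates? | Example   |
|-------|----------|------------------------|-------------|-----------|
| list  | Yes      | Yes                    | Yes         | [1, 2, 3] |
| tuple | No       | Yes                    | Yes         | (1, 2, 3) |
| dict  | Yes      | Insertion order (3.7+) | Keys: No    | {"a": 1}  |
| set   | Yes      | No                     | No          | {1, 2, 3} |

python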
# list — your workhorse collection
scores = [0.82, 0.85, 0.91, 0.88]
scores.append(0.93)
scores.sort(reverse=True)     # [0.93, 0.91, 0.88, 0.85, 0.82]

# tuple — immutable, often used for shapes and coordinates
shape = (28, 28, 1)           # image dimensions: height, width, channels
height, width, channels = shape   # tuple unpacking

# dict — map keys to values
hyperparams = {
    "n_estimators": 100,
    "max_depth": 5,
    "learning_rate": 0.1,
}
hyperparams["min_samples_split"] = 10   # add a key
print(hyperparams.get("dropout", 0.0))  # safe access with default

# set — unique elements, fast membership testing
feature_names = {"age", "income", "age", "education"}
print(feature_names)  # e.g. {'age', 'income', 'education'}; duplicate dropped, order not guaranteed
print("age" in feature_names)  # True (average-case O(1) lookup)
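
The collections combine naturally. As an illustrative sketch (the `expected` and `received` feature lists are invented for the example), set difference can flag missing or unexpected columns, and a dict with `.get()` can count repeats:

```python
# Hypothetical feature lists, for illustration only
expected = {"age", "income", "education"}
received = ["age", "income", "zip_code", "age"]

missing = expected - set(received)       # expected but not received
unexpected = set(received) - expected    # received but not expected
print(sorted(missing))                   # ['education']
print(sorted(unexpected))                # ['zip_code']

# Count occurrences with a dict, using .get() to supply a default of 0
counts = {}
for name in received:
    counts[name] = counts.get(name, 0) + 1
print(counts)   # {'age': 2, 'income': 1, 'zip_code': 1}
```

Sorting the sets before printing gives a stable display order, since sets themselves are unordered.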

4. Type Conversion

python
int("42")          # 42
float("3.14")      # 3.14
str(99)            # '99'
bool(0)            # False
bool("")           # False
bool([])           # False  — empty collections are falsy
list("abc")        # ['a', 'b', 'c']
tuple([1,2,3])     # (1, 2, 3)
set([1,1,2,2,3])   # {1, 2, 3}
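
Conversions raise an exception when the input cannot be parsed, which matters when reading raw data. One possible way to handle that is a small wrapper such as the hypothetical `to_float` below; the name and the `default` parameter are assumptions made for this sketch.

```python
def to_float(text, default=None):
    """Convert text to float, returning `default` when parsing fails."""
    try:
        return float(text)
    except (TypeError, ValueError):
        # ValueError: unparseable string; TypeError: e.g. None was passed
        return default


print(to_float("3.14"))        # 3.14
print(to_float("n/a"))         # None
print(to_float("", 0.0))       # 0.0

# Pitfall: any non-empty string is truthy, including "False" and "0".
print(bool("False"))           # True

# Pitfall: int() on a float truncates toward zero; it does not round.
print(int(3.99))               # 3
```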

5. Conditional Logic: if / elif / else

python
accuracy = 0.87

if accuracy >= 0.95:
    verdict = "Excellent"
elif accuracy >= 0.85:
    verdict = "Good"
elif accuracy >= 0.70:
    verdict = "Acceptable"
else:
    verdict = "Needs improvement"

print(f"Model verdict: {verdict}")  # Good

# Ternary expression: a one-line conditional
probability = 0.73   # example predicted probability
label = "spam" if probability > 0.5 else "ham"
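
The same threshold chain is often wrapped in a function so it can be reused and tested. The sketch below reuses the thresholds from the example above; the function name `verdict_for` and the range check are additions for illustration.

```python
def verdict_for(accuracy):
    """Return a label for an accuracy in [0, 1]."""
    # Chained comparison: equivalent to 0.0 <= accuracy and accuracy <= 1.0
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError(f"accuracy must be in [0, 1], got {accuracy}")
    # Early returns make the elif/else keywords unnecessary here
    if accuracy >= 0.95:
        return "Excellent"
    if accuracy >= 0.85:
        return "Good"
    if accuracy >= 0.70:
        return "Acceptable"
    return "Needs improvement"


print(verdict_for(0.87))   # Good
print(verdict_for(0.50))   # Needs improvement
```

Order matters: the checks run top to bottom, so the strictest threshold has to come first.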

6. Loops: for and while

python
# for loop — iterate over any iterable
models = ["Logistic Regression", "SVM", "Random Forest"]
for model in models:
    print(f"Training {model}...")

# range() for numeric loops
for epoch in range(1, 11):    # 1 through 10
    print(f"Epoch {epoch}/10")

# enumerate — get index AND value
for i, model in enumerate(models):
    print(f"{i}: {model}")

# zip — iterate over multiple sequences in parallel
names   = ["Model A", "Model B", "Model C"]
scores  = [0.91, 0.88, 0.93]
for name, score in zip(names, scores):
    print(f"{name}: {score:.2f}")

# while loop — when you don't know iterations in advance
loss = 10.0
while loss > 0.01:
    loss *= 0.5               # simulate training
    print(f"Loss: {loss:.4f}")
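
Loops are frequently combined with `break` to stop early. The following is a simplified sketch of patience-based early stopping over made-up validation losses; `val_losses` and `patience` are illustrative values, and real training frameworks implement this with more options.

```python
# Simulated validation losses, one per epoch (made-up numbers)
val_losses = [0.90, 0.70, 0.65, 0.66, 0.67, 0.64, 0.70]
patience = 2     # hypothetical: epochs to wait without improvement

best_loss = float("inf")
epochs_without_improvement = 0
stopped_at = None

# enumerate(..., start=1) gives human-friendly epoch numbers
for epoch, loss in enumerate(val_losses, start=1):
    if loss < best_loss:
        best_loss = loss
        epochs_without_improvement = 0
    else:
        epochs_without_improvement += 1
    if epochs_without_improvement >= patience:
        stopped_at = epoch
        break    # leave the loop early

print(best_loss, stopped_at)   # 0.65 5
```

Note that the loop stops at epoch 5 and never sees the better loss at epoch 6, which is the usual trade-off when choosing a small patience value.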

7. Comprehensions

Comprehensions are a concise, Pythonic way to build new collections. They replace simple for-loops with a single expression.

python
# List comprehension — square each number
squares = [x ** 2 for x in range(10)]
print("Squares:", squares)

# With a filter
even_squares = [x ** 2 for x in range(10) if x % 2 == 0]
print("Even squares:", even_squares)

# Dict comprehension: invert a dictionary (assumes the values are unique and hashable)
original = {"a": 1, "b": 2, "c": 3}
inverted = {v: k for k, v in original.items()}
print("Inverted:", inverted)

# Set comprehension
words = ["hello", "HELLO", "Hello", "world"]
unique_lower = {w.lower() for w in words}
print("Unique:", unique_lower)

# Nested comprehension — flatten a matrix
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
flat = [val for row in matrix for val in row]
print("Flat:", flat)

Output (set element order may vary):
Squares: [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
Even squares: [0, 4, 16, 36, 64]
Inverted: {1: 'a', 2: 'b', 3: 'c'}
Unique: {'hello', 'world'}
Flat: [1, 2, 3, 4, 5, 6, 7, 8, 9]
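
To see how a comprehension replaces a simple for-loop, it can help to write both side by side. This sketch uses an invented `scores` list and an arbitrary 0.85 cutoff, and checks that the two forms produce the same result:

```python
scores = [0.82, 0.85, 0.91, 0.88]   # made-up values for illustration

# Explicit loop: filter, transform, append
passing_loop = []
for s in scores:
    if s >= 0.85:
        passing_loop.append(round(s * 100))

# Equivalent list comprehension: transform, source, then filter
passing = [round(s * 100) for s in scores if s >= 0.85]

print(passing)                   # [85, 91, 88]
print(passing == passing_loop)   # True
```

When the body needs several statements or nested conditions, the explicit loop is usually the more readable choice.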