Part One
Boolean Masks — True/False for Every Row
Filtering in pandas works through a boolean mask: a Series of True/False values, one for each row of the DataFrame. When you apply a comparison to a column, pandas checks every row and returns True if the condition is met and False if not.
The result is a Series of True and False values — one per row. Rows 3 (Cheese), 5 (Olive Oil), 11 (Chicken), 12 (Beef), and 13 (Salmon) cost more than €3.00, so they are True. All others are False.
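As a small self-contained sketch of the same idea, the snippet below builds a mask on a hypothetical five-row table. The table, its column names (item, price), and its prices are assumptions made for illustration; they are not the chapter's grocery dataset, so the row numbers differ from those described above.

```python
import pandas as pd

# Hypothetical mini-table; names and prices are illustrative only.
df = pd.DataFrame({
    "item": ["Bread", "Milk", "Cheese", "Rice", "Salmon"],
    "price": [1.20, 0.95, 4.50, 1.80, 7.90],
})

# Comparing a column to a value checks every row and
# returns a boolean Series with one True/False per row.
mask = df["price"] > 3.00
print(mask)
```

In this sketch only Cheese and Salmon cost more than 3.00, so only their rows are True in the mask.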
Now you can use that mask to filter the DataFrame — keep only the rows where the mask is True:
The pattern is df[condition]. Read it as: "give me the rows of df where the condition is True." Notice that the original row indices are preserved — the row that was number 3 in the original table is still number 3 in the filtered result.
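The df[condition] pattern can be sketched on a hypothetical table like this. The column names and values are assumptions for illustration, not the chapter's dataset.

```python
import pandas as pd

# Hypothetical mini-table; names and prices are illustrative only.
df = pd.DataFrame({
    "item": ["Bread", "Milk", "Cheese", "Rice", "Salmon"],
    "price": [1.20, 0.95, 4.50, 1.80, 7.90],
})

# Keep only the rows where the condition is True.
expensive = df[df["price"] > 3.00]
print(expensive)

# The surviving rows keep their original index labels.
print(expensive.index.tolist())
```

Here the filtered result keeps the labels of the rows it came from, rather than being renumbered from zero.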
Part Two
Comparison Operators
You can use all the standard comparison operators in your conditions:
== equal to · != not equal to
> greater than · >= greater than or equal to
< less than · <= less than or equal to
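A brief sketch of several of these operators used as filter conditions follows. The table and the column names (item, category, price) are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical mini-table; names and values are illustrative only.
df = pd.DataFrame({
    "item": ["Bread", "Milk", "Cheese", "Rice"],
    "category": ["Bakery", "Dairy", "Dairy", "Grains"],
    "price": [1.20, 0.95, 4.50, 1.80],
})

dairy = df[df["category"] == "Dairy"]       # equal to
not_dairy = df[df["category"] != "Dairy"]   # not equal to
at_least = df[df["price"] >= 1.20]          # greater than or equal to
under = df[df["price"] < 1.20]              # less than

print(dairy, not_dairy, at_least, under, sep="\n\n")
```

The same operators work on text columns and numeric columns alike; each comparison produces a boolean mask that df[...] then applies.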
Part Three
Multiple Conditions — & and |
Real questions usually involve more than one condition: "show me organic items that cost less than €2" or "show me meat or fish items." pandas uses & for AND and | for OR. Each condition must be wrapped in parentheses.
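and / or: In pandas filters you must use & and | (bitwise operators), not the words and and or. The word versions operate on a single True/False value, so applying them to a whole Series raises a ValueError; the symbol versions work element by element across the Series. Also, always wrap each condition in parentheses: & and | bind more tightly than comparison operators such as < and ==, so an unparenthesized condition is evaluated in the wrong order.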
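The two example questions above could be sketched as follows. The table is hypothetical, and the sketch assumes that organic is stored as a boolean column and that categories are named "Meat" and "Fish"; the chapter's dataset may be laid out differently.

```python
import pandas as pd

# Hypothetical mini-table; names and values are illustrative only.
df = pd.DataFrame({
    "item": ["Carrots", "Oats", "Chicken", "Salmon", "Apples"],
    "category": ["Vegetables", "Grains", "Meat", "Fish", "Fruit"],
    "price": [0.90, 1.50, 5.20, 7.90, 2.40],
    "organic": [True, True, False, False, True],  # assumed boolean column
})

# AND: both conditions must hold for a row to be kept.
cheap_organic = df[(df["organic"]) & (df["price"] < 2.00)]

# OR: a row is kept if either condition holds.
meat_or_fish = df[(df["category"] == "Meat") | (df["category"] == "Fish")]

print(cheap_organic)
print(meat_or_fish)
```

Each condition sits in its own pair of parentheses, so the comparison is evaluated before & or | combines the resulting masks.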
Part Four
.isin() and Negation
When you want to filter for multiple specific values in one column, writing many OR conditions becomes tedious. .isin() is the clean solution — it checks whether each value belongs to a list you provide.
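A minimal sketch of .isin() on a hypothetical table is shown below, next to the equivalent chain of OR conditions. The column names and category values are assumptions for illustration.

```python
import pandas as pd

# Hypothetical mini-table; names and values are illustrative only.
df = pd.DataFrame({
    "item": ["Carrots", "Oats", "Chicken", "Salmon", "Apples"],
    "category": ["Vegetables", "Grains", "Meat", "Fish", "Fruit"],
})

# One call checks membership in the whole list of values.
wanted = ["Meat", "Fish", "Fruit"]
with_isin = df[df["category"].isin(wanted)]

# The same filter written as chained OR conditions.
with_or = df[
    (df["category"] == "Meat")
    | (df["category"] == "Fish")
    | (df["category"] == "Fruit")
]

print(with_isin)
```

Both versions select the same rows; the .isin() form stays one line long no matter how many values are in the list.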
To exclude rows that match a condition, put a tilde ~ before the condition. The tilde means NOT — it flips True to False and vice versa.
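The tilde can be sketched on a hypothetical table as follows, once in front of a comparison and once in front of an .isin() check. The column names and values are assumptions for illustration.

```python
import pandas as pd

# Hypothetical mini-table; names and values are illustrative only.
df = pd.DataFrame({
    "item": ["Carrots", "Oats", "Chicken", "Salmon", "Apples"],
    "category": ["Vegetables", "Grains", "Meat", "Fish", "Fruit"],
    "price": [0.90, 1.50, 5.20, 7.90, 2.40],
})

# ~ flips every True/False in the mask, so this keeps
# the rows that do NOT cost more than 2.00.
not_pricey = df[~(df["price"] > 2.00)]

# Combined with .isin(): everything except Meat and Fish.
no_meat_or_fish = df[~df["category"].isin(["Meat", "Fish"])]

print(not_pricey)
print(no_meat_or_fish)
```

The parentheses around the comparison matter here too, because ~ would otherwise be applied to the column before the comparison runs.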
Part Five
Your Turn — A Price Analysis
Use filtering to answer these questions about the grocery dataset:
- Which items are organic and cost more than €2?
- Which items are from the "Vegetables" or "Grains" category and cost less than €1?
- How many non-organic items are there? (Hint: filter, then check .shape[0])
This chapter covered building boolean masks with comparison operators; combining conditions with & (AND) and | (OR); matching multiple values with .isin(); and inverting conditions with ~. In the next chapter you will compute summary statistics on filtered and unfiltered data.