Book 4 — Data Analysis with Python

Python for All

Chapter Four — Filtering Rows

Thanasis Troboukis

The ability to ask precise questions of your data — "show me only items that cost more than €3" or "show me organic dairy products" — is the engine of data journalism. This chapter teaches you boolean filtering in pandas.

Boolean Masks — True/False for Every Row

Filtering in pandas works through a boolean mask: a Series of True/False values, one for each row of the DataFrame. When you apply a comparison to a column, pandas checks every row and returns True if the condition is met and False if not.
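As a minimal sketch of such a mask, the table below is a hypothetical stand-in for the chapter's grocery dataset. The column names (item, category, price, organic) and every value are assumptions made for illustration, chosen so that the rows above €3.00 sit at index positions 3, 5, 11, 12 and 13.

```python
import pandas as pd

# Hypothetical stand-in for the chapter's grocery dataset: the column names
# and every value below are invented for illustration.
df = pd.DataFrame({
    "item": ["Milk", "Yogurt", "Butter", "Cheese", "Bread", "Olive Oil", "Rice",
             "Pasta", "Apples", "Tomatoes", "Bananas", "Chicken", "Beef", "Salmon"],
    "category": ["Dairy", "Dairy", "Dairy", "Dairy", "Bakery", "Pantry", "Pantry",
                 "Pantry", "Produce", "Produce", "Produce", "Meat", "Meat", "Fish"],
    "price": [1.20, 0.90, 2.50, 4.50, 1.10, 7.90, 1.80,
              0.95, 2.20, 1.90, 1.60, 6.50, 9.80, 12.00],
    "organic": [True, False, False, True, False, True, False,
                False, True, True, False, False, False, True],
})

# Comparing a column to a value produces one True/False per row.
mask = df["price"] > 3.00
print(mask)

# Index labels of the rows where the mask is True.
true_rows = mask[mask].index.tolist()
print(true_rows)  # [3, 5, 11, 12, 13]
```

The mask has the same length and the same index as df, which is what lets pandas line it up with the rows later.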

The result is a Series of True and False values — one per row. Rows 3 (Cheese), 5 (Olive Oil), 11 (Chicken), 12 (Beef), and 13 (Salmon) cost more than €3.00, so they are True. All others are False.

Now you can use that mask to filter the DataFrame — keep only the rows where the mask is True:
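A sketch of that step, again on a hypothetical stand-in table (column names and values are assumptions, not the chapter's actual data): the mask goes inside square brackets, and only the True rows come back.

```python
import pandas as pd

# Hypothetical stand-in for the chapter's grocery dataset: the column names
# and every value below are invented for illustration.
df = pd.DataFrame({
    "item": ["Milk", "Yogurt", "Butter", "Cheese", "Bread", "Olive Oil", "Rice",
             "Pasta", "Apples", "Tomatoes", "Bananas", "Chicken", "Beef", "Salmon"],
    "category": ["Dairy", "Dairy", "Dairy", "Dairy", "Bakery", "Pantry", "Pantry",
                 "Pantry", "Produce", "Produce", "Produce", "Meat", "Meat", "Fish"],
    "price": [1.20, 0.90, 2.50, 4.50, 1.10, 7.90, 1.80,
              0.95, 2.20, 1.90, 1.60, 6.50, 9.80, 12.00],
    "organic": [True, False, False, True, False, True, False,
                False, True, True, False, False, False, True],
})

# Build the mask, then pass it to df[...] to keep only the True rows.
mask = df["price"] > 3.00
expensive = df[mask]          # equivalent to df[df["price"] > 3.00]
print(expensive)

# The filtered rows keep the index labels they had in the full table.
print(expensive.index.tolist())  # [3, 5, 11, 12, 13]
```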

The pattern is df[condition]. Read it as: "give me the rows of df where the condition is True." Notice that the original row indices are preserved — the row that was number 3 in the original table is still number 3 in the filtered result.

Comparison Operators

You can use all the standard comparison operators in your conditions:

Operators you can use:
== equal to  ·  != not equal to
> greater than  ·  >= greater than or equal
< less than  ·  <= less than or equal
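The operators listed above can be tried out as follows. This is a sketch on a hypothetical stand-in table, so the column names and values are assumptions rather than the chapter's real data.

```python
import pandas as pd

# Hypothetical stand-in for the chapter's grocery dataset: the column names
# and every value below are invented for illustration.
df = pd.DataFrame({
    "item": ["Milk", "Yogurt", "Butter", "Cheese", "Bread", "Olive Oil", "Rice",
             "Pasta", "Apples", "Tomatoes", "Bananas", "Chicken", "Beef", "Salmon"],
    "category": ["Dairy", "Dairy", "Dairy", "Dairy", "Bakery", "Pantry", "Pantry",
                 "Pantry", "Produce", "Produce", "Produce", "Meat", "Meat", "Fish"],
    "price": [1.20, 0.90, 2.50, 4.50, 1.10, 7.90, 1.80,
              0.95, 2.20, 1.90, 1.60, 6.50, 9.80, 12.00],
    "organic": [True, False, False, True, False, True, False,
                False, True, True, False, False, False, True],
})

dairy = df[df["category"] == "Dairy"]        # equal to
not_dairy = df[df["category"] != "Dairy"]    # not equal to
at_least_2_50 = df[df["price"] >= 2.50]      # greater than or equal (2.50 itself is included)
under_1 = df[df["price"] < 1.00]             # less than

print(dairy["item"].tolist())            # ['Milk', 'Yogurt', 'Butter', 'Cheese']
print(len(not_dairy))                    # 10
print(at_least_2_50.index.tolist())      # [2, 3, 5, 11, 12, 13]
print(under_1["item"].tolist())          # ['Yogurt', 'Pasta']
```

Text columns are compared with == and !=, while the ordering operators are most useful on numeric columns such as price.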

Multiple Conditions — & and |

Real questions usually involve more than one condition: "show me organic items that cost less than €2" or "show me meat or fish items." pandas uses & for AND and | for OR. Each condition must be wrapped in parentheses.
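The two example questions can be sketched like this, using a hypothetical stand-in table (the column names and values are assumptions for illustration). Each condition sits in its own parentheses.

```python
import pandas as pd

# Hypothetical stand-in for the chapter's grocery dataset: the column names
# and every value below are invented for illustration.
df = pd.DataFrame({
    "item": ["Milk", "Yogurt", "Butter", "Cheese", "Bread", "Olive Oil", "Rice",
             "Pasta", "Apples", "Tomatoes", "Bananas", "Chicken", "Beef", "Salmon"],
    "category": ["Dairy", "Dairy", "Dairy", "Dairy", "Bakery", "Pantry", "Pantry",
                 "Pantry", "Produce", "Produce", "Produce", "Meat", "Meat", "Fish"],
    "price": [1.20, 0.90, 2.50, 4.50, 1.10, 7.90, 1.80,
              0.95, 2.20, 1.90, 1.60, 6.50, 9.80, 12.00],
    "organic": [True, False, False, True, False, True, False,
                False, True, True, False, False, False, True],
})

# AND: both conditions must be True for a row to be kept.
cheap_organic = df[(df["organic"]) & (df["price"] < 2)]
print(cheap_organic["item"].tolist())   # ['Milk', 'Tomatoes']

# OR: a row is kept if at least one condition is True.
meat_or_fish = df[(df["category"] == "Meat") | (df["category"] == "Fish")]
print(meat_or_fish["item"].tolist())    # ['Chicken', 'Beef', 'Salmon']
```

The organic column is already boolean in this stand-in table, so it can serve as a condition without a comparison.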

Do NOT use Python's and / or: In pandas filters you must use & and | (bitwise operators), not the words and and or. The word versions operate on a single True/False value, so applying them to a Series raises a ValueError; the symbol versions work element by element across the whole Series. Also wrap each condition in parentheses: & and | bind more tightly than comparisons such as > and ==, so without parentheses Python groups the expression in a way you did not intend.
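To see what goes wrong with the word operators, here is a small sketch on a hypothetical stand-in table (column names and values are assumptions). The version with and fails before any filtering happens, while the version with & works.

```python
import pandas as pd

# Hypothetical stand-in for the chapter's grocery dataset: the column names
# and every value below are invented for illustration.
df = pd.DataFrame({
    "item": ["Milk", "Yogurt", "Butter", "Cheese", "Bread", "Olive Oil", "Rice",
             "Pasta", "Apples", "Tomatoes", "Bananas", "Chicken", "Beef", "Salmon"],
    "price": [1.20, 0.90, 2.50, 4.50, 1.10, 7.90, 1.80,
              0.95, 2.20, 1.90, 1.60, 6.50, 9.80, 12.00],
    "organic": [True, False, False, True, False, True, False,
                False, True, True, False, False, False, True],
})

# Wrong: `and` asks for ONE True/False answer for the whole Series,
# which pandas refuses to give, so a ValueError is raised.
error_message = None
try:
    df[(df["price"] > 3) and (df["organic"])]
except ValueError as err:
    error_message = str(err)
    print("ValueError:", error_message)

# Right: `&` compares the two masks row by row.
expensive_organic = df[(df["price"] > 3) & (df["organic"])]
print(expensive_organic["item"].tolist())  # ['Cheese', 'Olive Oil', 'Salmon']
```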

.isin() and Negation

When you want to filter for multiple specific values in one column, writing many OR conditions becomes tedious. .isin() is the clean solution — it checks whether each value belongs to a list you provide.
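A sketch of .isin() next to the equivalent chain of OR conditions, on a hypothetical stand-in table (column names and values are assumptions made for illustration):

```python
import pandas as pd

# Hypothetical stand-in for the chapter's grocery dataset: the column names
# and every value below are invented for illustration.
df = pd.DataFrame({
    "item": ["Milk", "Yogurt", "Butter", "Cheese", "Bread", "Olive Oil", "Rice",
             "Pasta", "Apples", "Tomatoes", "Bananas", "Chicken", "Beef", "Salmon"],
    "category": ["Dairy", "Dairy", "Dairy", "Dairy", "Bakery", "Pantry", "Pantry",
                 "Pantry", "Produce", "Produce", "Produce", "Meat", "Meat", "Fish"],
    "price": [1.20, 0.90, 2.50, 4.50, 1.10, 7.90, 1.80,
              0.95, 2.20, 1.90, 1.60, 6.50, 9.80, 12.00],
})

# The long way: one OR condition per value.
with_or = df[(df["category"] == "Meat") | (df["category"] == "Fish")]

# The short way: .isin() returns True where the value is in the list.
with_isin = df[df["category"].isin(["Meat", "Fish"])]

print(with_isin["item"].tolist())   # ['Chicken', 'Beef', 'Salmon']
print(with_isin.equals(with_or))    # True
```

Adding a third or fourth category only means adding another entry to the list.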

To exclude rows that match a condition, put a tilde ~ before the condition, and wrap the condition in parentheses if it is a comparison. The tilde means NOT: it flips True to False and vice versa.
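A sketch of negation in three common forms, on a hypothetical stand-in table (column names and values are assumptions for illustration):

```python
import pandas as pd

# Hypothetical stand-in for the chapter's grocery dataset: the column names
# and every value below are invented for illustration.
df = pd.DataFrame({
    "item": ["Milk", "Yogurt", "Butter", "Cheese", "Bread", "Olive Oil", "Rice",
             "Pasta", "Apples", "Tomatoes", "Bananas", "Chicken", "Beef", "Salmon"],
    "category": ["Dairy", "Dairy", "Dairy", "Dairy", "Bakery", "Pantry", "Pantry",
                 "Pantry", "Produce", "Produce", "Produce", "Meat", "Meat", "Fish"],
    "organic": [True, False, False, True, False, True, False,
                False, True, True, False, False, False, True],
})

# Negate a comparison: the parentheses make ~ apply to the finished mask.
not_produce = df[~(df["category"] == "Produce")]
print(len(not_produce))                  # 11

# Negate .isin(): everything that is NOT meat or fish.
no_meat_or_fish = df[~df["category"].isin(["Meat", "Fish"])]
print(no_meat_or_fish.index.tolist())    # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Negate a boolean column directly: the non-organic items.
not_organic = df[~df["organic"]]
print(not_organic["item"].tolist())
```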

Your Turn — A Price Analysis

Use filtering to answer these questions about the grocery dataset:

What you learned in this chapter: boolean masks and how to apply them to filter rows; comparison operators; combining conditions with & (AND) and | (OR); matching multiple values with .isin(); and inverting conditions with ~. In the next chapter you will compute summary statistics on filtered and unfiltered data.
