Book 4 — Data Analysis with Python

Python for All

Chapter Seven — Sorting and New Columns

Thanasis Troboukis


Sorting makes patterns visible. Adding computed columns turns raw prices into derived insights — price per gram, affordability scores, percentage change. This chapter covers both.

.sort_values() — Ordering Your Data

A table of prices sorted from cheapest to most expensive is far more readable than the same data in arbitrary order. .sort_values() sorts a DataFrame by one or more columns.

Python · Try it — sort ascending (cheapest first)
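A minimal sketch of an ascending sort, assuming a small made-up price table. Apart from the rice price, the items and prices below are illustrative and are not the chapter's actual dataset.

```python
import pandas as pd

# Hypothetical sample data for illustration only.
df = pd.DataFrame({
    "item": ["Rice", "Pasta", "Salmon", "Cheese", "Yogurt"],
    "price": [0.89, 1.20, 12.50, 4.80, 1.10],
})

# sort_values() returns a new DataFrame; ascending order is the default.
cheapest_first = df.sort_values("price")
print(cheapest_first)
```

The original df is left unchanged, so assign the result to a name if you want to keep the sorted version.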

      
Python · Try it — sort descending (most expensive first)
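A sketch of the descending case, again on a made-up table (the values are assumptions for illustration). Passing ascending=False reverses the order.

```python
import pandas as pd

# Hypothetical sample data for illustration only.
df = pd.DataFrame({
    "item": ["Rice", "Pasta", "Salmon", "Cheese", "Yogurt"],
    "price": [0.89, 1.20, 12.50, 4.80, 1.10],
})

# ascending=False puts the largest price at the top.
most_expensive_first = df.sort_values("price", ascending=False)
print(most_expensive_first)
```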

      

Sort by multiple columns

You can sort by more than one column. The first column is the primary sort key; rows that tie on it are then ordered by the second column.

Python · Try it — sort by category, then by price
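A sketch of a two-key sort, assuming a hypothetical category column alongside the price (all values are made up for illustration). Passing a list of column names sorts by the first, then by the second within each group of ties.

```python
import pandas as pd

# Hypothetical sample data for illustration only.
df = pd.DataFrame({
    "item": ["Rice", "Pasta", "Salmon", "Cheese", "Yogurt"],
    "category": ["Grains", "Grains", "Fish", "Dairy", "Dairy"],
    "price": [0.89, 1.20, 12.50, 4.80, 1.10],
})

# Primary key: category (A to Z). Secondary key: price (low to high).
by_category_then_price = df.sort_values(["category", "price"])
print(by_category_then_price)

# A list for ascending sets the direction per column:
# categories A to Z, but the most expensive item first within each category.
priciest_per_category_first = df.sort_values(
    ["category", "price"], ascending=[True, False]
)
print(priciest_per_category_first)
```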

      

Adding New Columns

One of the most common data analysis tasks is creating a derived column — a new column calculated from existing ones. You create it simply by assigning to a new column name.

A price-per-100g column

Our dataset has prices for items sold in different weights. To compare fairly, we need a common unit. The data here uses fixed standard weights for illustration.

Python · Try it — calculated column
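A sketch of the calculated column, assuming a hypothetical weight_g column that holds each pack's weight in grams. The weights and most prices are invented for illustration; only the rice price of €0.89 per kilogram comes from the text.

```python
import pandas as pd

# Hypothetical sample data for illustration only.
df = pd.DataFrame({
    "item": ["Rice", "Pasta", "Salmon", "Cheese", "Yogurt"],
    "price": [0.89, 1.20, 12.50, 4.80, 1.10],      # price per pack, in euros
    "weight_g": [1000, 1000, 1000, 200, 200],      # assumed pack weights
})

# Assigning to a new column name creates the column.
# The arithmetic is applied to every row at once.
df["price_per_100g"] = df["price"] / df["weight_g"] * 100

by_raw_price = df.sort_values("price")
by_unit_price = df.sort_values("price_per_100g")

print(by_raw_price[["item", "price"]])
print(by_unit_price[["item", "price_per_100g"]])
```

With these assumed weights the two rankings differ: yogurt is cheaper than pasta per pack but more expensive per 100 g.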

      

Sorting by price_per_100g gives a different ranking from sorting by raw price. Rice looks cheap at €0.89/kg, and pasta turns out to be similarly cheap once you normalise by weight. Salmon and cheese cost significantly more per gram, an insight hidden in the raw prices.

A price-above-average flag

Python · Try it — boolean column
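A sketch of a boolean flag column, using made-up prices and a hypothetical column name, above_average. Comparing a column with a single number yields one True or False per row, and that result can be stored like any other column.

```python
import pandas as pd

# Hypothetical sample data for illustration only.
df = pd.DataFrame({
    "item": ["Rice", "Pasta", "Salmon", "Cheese", "Yogurt"],
    "price": [0.89, 1.20, 12.50, 4.80, 1.10],
})

mean_price = df["price"].mean()

# True where the item costs more than the average price, False otherwise.
df["above_average"] = df["price"] > mean_price

print(f"Mean price: {mean_price:.2f}")
print(df)

# A boolean column can also be used directly as a filter.
print(df[df["above_average"]])
```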

      

.apply() — Custom Transformations

Sometimes you need a transformation that cannot be expressed as simple arithmetic — for example, classifying prices into bands ("budget", "mid-range", "premium"). .apply() lets you run a custom function on each value in a Series.

Python · Try it
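A sketch of classifying prices into bands with .apply(). The function name price_band and the thresholds (2 and 10 euros) are assumptions chosen for this example, and the data is made up.

```python
import pandas as pd

# Hypothetical sample data for illustration only.
df = pd.DataFrame({
    "item": ["Rice", "Pasta", "Salmon", "Cheese", "Yogurt"],
    "price": [0.89, 1.20, 12.50, 4.80, 1.10],
})

def price_band(price):
    """Classify one price; the cut-off values are illustrative assumptions."""
    if price < 2:
        return "budget"
    elif price < 10:
        return "mid-range"
    return "premium"

# .apply() calls price_band once per value and collects the results.
df["band"] = df["price"].apply(price_band)
print(df)
```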

      
When to use .apply(): Use it for logic that a vectorised operation (arithmetic, comparisons, .str methods) cannot express. Avoid it for simple maths: df["price"] * 2 is faster and cleaner than df["price"].apply(lambda x: x * 2).

String Methods with .str

pandas exposes Python string methods on text columns through the .str accessor. This lets you clean, search, and transform text without writing loops.

Python · Try it
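A sketch of a few .str methods on a text column. The deliberately messy item names are invented for illustration.

```python
import pandas as pd

# Hypothetical, deliberately untidy names for illustration only.
df = pd.DataFrame({
    "item": ["  rice ", "PASTA", "Smoked Salmon", "cheese", "Greek yogurt"],
})

# Clean: strip surrounding spaces, then capitalise each word.
df["item_clean"] = df["item"].str.strip().str.title()

# Search: case-insensitive substring match, one True/False per row.
df["is_salmon"] = df["item_clean"].str.contains("salmon", case=False)

# Transform: number of characters in each cleaned name.
df["name_length"] = df["item_clean"].str.len()

print(df)
```

Calls on .str can be chained because each one returns a new Series.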

      

Your Turn — Build a Price Index

Create a "value score" column that divides each item's price by the overall mean price. A score below 1.0 means the item is cheaper than average; above 1.0 means it is more expensive. Sort the result from best value to worst.

Python · Your turn
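One possible solution sketch, to compare against after you have tried it yourself. The data is made up, and value_score is simply the column name suggested by the exercise text.

```python
import pandas as pd

# Hypothetical sample data for illustration only.
df = pd.DataFrame({
    "item": ["Rice", "Pasta", "Salmon", "Cheese", "Yogurt"],
    "price": [0.89, 1.20, 12.50, 4.80, 1.10],
})

# Each price divided by the overall mean price.
# Below 1.0 means cheaper than average; above 1.0 means more expensive.
df["value_score"] = df["price"] / df["price"].mean()

# Best value (lowest score) first.
ranked = df.sort_values("value_score")
print(ranked)
```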

      
What you learned in this chapter: sorting with .sort_values() by one or multiple columns; creating new columns via arithmetic on existing ones; classifying values into categories using .apply() with a custom function; and using .str methods to work with text columns. In the final chapter, you will learn to handle missing data and put all these skills together in a complete analysis.
