Braithwaite I/O

Pandas & Python Slugify

05 September 2018


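I have been using Val Neekman's Python library, python-slugify, to clean up pandas column names. The example here reads an HTML table of Canadian national parks and turns headers such as 'Established[12]' into plain snake_case names.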

In [1]:
import pandas as pd
In [9]:
df = pd.read_html('', header=0)[0]

df.head()

  | Name | Photo | Location | Established[12] | Area (2017)[12] | Natural region[13] | Description
0 | Akami-Uapishkᵁ-KakKasuak-Mealy Mountains(Reserve) | NaN | Newfoundland and Labrador53°24′N 59°22′W / 5... | 2015 | 10,700 km2 (4,131 sq mi) | East coast boreal | The park includes a portion of the glacially-r...
1 | Aulavik | NaN | Northwest Territories73°42′N 119°55′W / 73.7... | 1992 | 12,200 km2 (4,710 sq mi) | Western arctic lowlands | Located on the northern part of Banks Island, ...
2 | Auyuittuq | NaN | Nunavut67°53′N 65°01′W / 67.883°N 65.017°W | 2001 | 19,089 km2 (7,370 sq mi) | Northern Davis region | One of Canada's largest parks and located almo...
3 | Banff‡ | NaN | Alberta51°30′N 116°0′W / 51.500°N 116.000°W | 1885 | 6,641 km2 (2,564 sq mi) | Rocky Mountains | The first park established by the federal gove...
4 | Bruce Peninsula | NaN | Ontario45°14′N 81°37′W / 45.233°N 81.617°W | 1987 | 125 km2 (48 sq mi) | St. Lawrence lowlands | Formed from lands previously designated Ontari...
In [11]:
df.columns
Index(['Name', 'Photo', 'Location', 'Established[12]', 'Area (2017)[12]',
       'Natural region[13]', 'Description'],
      dtype='object')
In [12]:
from slugify import slugify

# python-slugify lowercases by default, so only the separator needs to be set
df.columns = [slugify(col, separator='_') for col in df.columns]

In [13]:
df.columns
Index(['name', 'photo', 'location', 'established_12', 'area_2017_12',
       'natural_region_13', 'description'],
      dtype='object')
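
To see roughly what is happening to each header, here is a minimal standard-library sketch. The `simple_slug` helper is hypothetical and is not python-slugify's implementation. It assumes that ASCII folding, lowercasing, and collapsing runs of non-alphanumeric characters into a single-character separator are enough for these headers. The real library handles more cases, such as HTML entities, stopwords, and length limits.

```python
import re
import unicodedata


def simple_slug(text, separator="_"):
    """Hypothetical stand-in for slugify; assumes a single-character separator."""
    # Fold accented characters to ASCII and drop anything that cannot be folded.
    ascii_text = (
        unicodedata.normalize("NFKD", text)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Replace each run of non-alphanumeric characters with one separator.
    slug = re.sub(r"[^a-z0-9]+", separator, ascii_text.lower())
    # Trim separators left at either end, e.g. by a trailing "]".
    return slug.strip(separator)


columns = ['Name', 'Photo', 'Location', 'Established[12]', 'Area (2017)[12]',
           'Natural region[13]', 'Description']
cleaned = [simple_slug(col) for col in columns]
print(cleaned)
```

For the seven headers above, this yields the same names as the slugified index, so it is a reasonable mental model for why the footnote markers become `_12` and `_13` suffixes.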
In [14]:
df.tail()

   | name | photo | location | established_12 | area_2017_12 | natural_region_13 | description
42 | Vuntut | NaN | Yukon68°22′N 139°51′W / 68.367°N 139.850°W | 1993 | 4,345 km2 (1,678 sq mi) | Northern Yukon | Adjacent to the Ivvavik National Park and the ...
43 | Wapusk | NaN | Manitoba57°46′N 93°22′W / 57.767°N 93.367°W | 1996 | 11,475 km2 (4,431 sq mi) | Hudson—James lowlands | Created from a portion of the provincial Churc...
44 | Waterton Lakes‡[f] | NaN | Alberta49°03′N 113°55′W / 49.050°N 113.917°W | 1895 | 505 km2 (195 sq mi) | Rocky Mountains | Coupled with American neighbour Glacier Nation...
45 | Wood Buffalo‡ | NaN | Alberta Northwest Territories59°23′N 112°59′W... | 1922 | 44,972 km2 (17,364 sq mi) | Northern boreal plains | The largest park in Canada, the park protects ...
46 | Yoho‡ | NaN | British Columbia51°24′N 116°29′W / 51.400°N ... | 1886 | 1,313 km2 (507 sq mi) | Rocky Mountains | Part of the Canadian Rocky Mountain Parks Worl...