Given a complex dataframe df:
from skrub.datasets import fetch_employee_salaries
# Load the dataset: features in X, prediction target in y
dataset = fetch_employee_salaries()
df = dataset.X
y = dataset.y
df
      gender  department  department_name               division                     assignment_category  employee_position_title                    date_first_hired  year_first_hired
0     F       POL         Department of Police          MSB Information Mgmt and...  Fulltime-Regular     Office Services Coordinator                09/22/1986        1986
1     M       POL         Department of Police          ISB Major Crimes...          Fulltime-Regular     Master Police Officer                      09/12/1988        1988
...   ...     ...         ...                           ...                          ...                  ...                                        ...               ...
9226  M       CCL         County Council                Council Central Staff        Fulltime-Regular     Manager II                                 09/05/2006        2006
9227  M       DLC         Department of Liquor Control  Licensure, Regulation...     Fulltime-Regular     Alcohol/Tobacco Enforcement Specialist II  01/30/2012        2012
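from sklearn.model_selection import cross_val_score
from skrub import tabular_learner
# Cross-validate a ready-made regression pipeline directly on the raw dataframe
# (cross_val_score uses 5 folds by default, hence the 5 scores)
cross_val_score(tabular_learner('regressor'), df, y)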
array([0.89370447, 0.89279068, 0.92282557, 0.92319094, 0.92162666])
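The five values are per-fold scores, so it can help to condense them into a single mean with a spread. Below is a minimal sketch using only the standard library. The scores are copied by hand from the output above, and the metric is assumed to be R², scikit-learn's default for regressors when no `scoring` argument is passed. The variable names are illustrative.

```python
from statistics import mean, stdev

# Per-fold scores copied from the cross_val_score output above.
scores = [0.89370447, 0.89279068, 0.92282557, 0.92319094, 0.92162666]

mean_score = mean(scores)   # average score over the folds
spread = stdev(scores)      # sample standard deviation across the folds

# Assumption: the scores are R^2 values (the default metric for regressors).
print(f"R^2 across {len(scores)} folds: {mean_score:.3f} +/- {spread:.3f}")
```

In practice the array returned by `cross_val_score` could be summarised directly with its `.mean()` and `.std()` methods. Note that NumPy's `.std()` computes the population standard deviation by default, so it would come out slightly smaller than `stdev` here.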