This report presents a K-means clustering analysis of Dutch driving schools based on quantitative characteristics: exam success rates, total exam volume, lesson types offered, and retake outcomes. Schools with fewer than 10 exams were excluded from the dataset. Latitude and longitude were removed because of their low variance, so that clustering was driven by performance-related variables. The elbow method was used to choose the number of clusters.
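The preprocessing and elbow steps described above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the pipeline actually used for the report: the school records, the field names (`n_exams`, `success_rate`, `first_attempt_rate`), and the z-score standardization step are all assumptions made for the example.

```python
import random

# Hypothetical toy records; the real dataset and its column names are not shown in the report.
schools = [
    {"name": "A", "n_exams": 48, "success_rate": 0.66, "first_attempt_rate": 0.64},
    {"name": "B", "n_exams": 52, "success_rate": 0.65, "first_attempt_rate": 0.63},
    {"name": "C", "n_exams": 210, "success_rate": 0.56, "first_attempt_rate": 0.55},
    {"name": "D", "n_exams": 198, "success_rate": 0.57, "first_attempt_rate": 0.56},
    {"name": "E", "n_exams": 170, "success_rate": 0.57, "first_attempt_rate": 0.52},
    {"name": "F", "n_exams": 180, "success_rate": 0.56, "first_attempt_rate": 0.50},
    {"name": "G", "n_exams": 60, "success_rate": 0.40, "first_attempt_rate": 0.35},
    {"name": "H", "n_exams": 75, "success_rate": 0.41, "first_attempt_rate": 0.36},
    {"name": "I", "n_exams": 6, "success_rate": 0.50, "first_attempt_rate": 0.50},
]
features = ["n_exams", "success_rate", "first_attempt_rate"]

# Exclude schools with fewer than 10 exams, as in the report.
kept = [s for s in schools if s["n_exams"] >= 10]

def standardize(rows):
    """Z-score each column (an assumed step) so exam counts do not dominate distances."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [(sum((x - m) ** 2 for x in c) / len(c)) ** 0.5 or 1.0
            for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
            for p in points]

def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centroids, inertia)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(n_iter):
        labels = assign(points, centroids)
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            # Keep the old centroid if a cluster ends up empty.
            updated.append([sum(c) / len(members) for c in zip(*members)]
                           if members else centroids[j])
        if updated == centroids:
            break
        centroids = updated
    labels = assign(points, centroids)
    inertia = sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))
    return labels, centroids, inertia

X = standardize([[float(s[f]) for f in features] for s in kept])

# Elbow method: within-cluster sum of squares (inertia) for a range of k.
inertias = {k: kmeans(X, k)[2] for k in range(1, 6)}
for k, value in inertias.items():
    print(f"k={k}: inertia={value:.3f}")
```

The "elbow" is the value of k after which the inertia stops dropping sharply. In practice several random restarts per k would make the curve more reliable than the single seeded run shown here.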
Four distinct clusters were identified and interpreted as follows:
Cluster 0 – Low Success, Modest Size
This cluster contains schools with the lowest average success rate (40.2%) and first-attempt pass rate (35.3%). They offer very few lesson types, with no theory or automatic-transmission training. These may be low-cost or low-preparation schools focused on getting students to the exam quickly.
Cluster 1 – High Success, Low Volume
Schools in this cluster achieve the highest pass rates (65.6% overall, 63.8% on the first attempt) but handle the fewest exams, 49.7 on average. These schools likely focus on quality instruction and smaller student bases, offering well-rounded training programs that include theory and practical preparation.
Cluster 2 – Automatic-Transmission High-Volume Providers
This cluster is characterized by very high exam volume (205.1 exams on average) and universal use of automatic-transmission vehicles (100%). These schools offer consistent lesson availability and maintain a moderate first-attempt pass rate (55.4%). They are likely large-scale urban schools or franchises specializing in automatic licensing.
Cluster 3 – Balanced All-Rounders
These schools combine solid exam performance (56.5% success rate) with a full suite of lesson offerings and moderate exam volume (175.1 exams on average). With both manual and automatic-transmission lessons and high availability of practical training, they likely represent full-service, experienced driving schools.
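The per-cluster figures quoted in the descriptions above are averages over the schools assigned to each cluster. A minimal sketch of that profiling step is shown below; the function name `cluster_profiles`, the feature names, and the toy rows and labels are hypothetical and do not come from the report's data.

```python
from collections import defaultdict

def cluster_profiles(rows, labels, feature_names):
    """Average each feature over the schools in each cluster (original units, not scaled)."""
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[label].append(row)
    profiles = {}
    for label, members in sorted(groups.items()):
        profile = {name: sum(col) / len(col)
                   for name, col in zip(feature_names, zip(*members))}
        profile["n_schools"] = len(members)
        profiles[label] = profile
    return profiles

# Hypothetical toy data: [number of exams, success rate] per school, plus cluster labels.
rows = [[40, 0.40], [60, 0.50], [200, 0.55]]
labels = [0, 0, 1]
profiles = cluster_profiles(rows, labels, ["n_exams", "success_rate"])
for label, profile in profiles.items():
    print(label, profile)
```

Profiles are computed on the unscaled values so that the averages can be read directly as exam counts and pass rates.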
The principal component analysis (PCA) confirmed that success rates, exam volume, and lesson type diversity were the most influential dimensions in differentiating the clusters. This segmentation can help both students and policymakers distinguish driving school strategies on the basis of measurable characteristics.
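One way to read "most influential dimensions" from a PCA is to inspect the loadings of the leading component: features with large absolute loadings contribute most to the direction of greatest variance. The sketch below illustrates the idea with the standard library only. It uses power iteration in place of a full eigendecomposition, and the feature names and toy values are hypothetical; the report does not say how its PCA was computed.

```python
def covariance_matrix(rows):
    """Sample covariance matrix of the columns of `rows`."""
    n = len(rows)
    means = [sum(c) / n for c in zip(*rows)]
    centered = [[x - m for x, m in zip(r, means)] for r in rows]
    d = len(means)
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]

def mat_vec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def first_component(cov, n_iter=500):
    """Leading eigenvector and eigenvalue of `cov` via power iteration."""
    d = len(cov)
    v = [1.0 / d ** 0.5] * d
    for _ in range(n_iter):
        w = mat_vec(cov, v)
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    eigenvalue = sum(a * b for a, b in zip(v, mat_vec(cov, v)))
    return v, eigenvalue

# Hypothetical toy data: the first two features move together, the third barely varies.
feature_names = ["success_rate", "n_exams", "n_lesson_types"]
rows = [[1.0, 1.0, 0.0], [2.0, 2.0, 0.1], [3.0, 3.0, 0.0], [4.0, 4.0, 0.1]]

cov = covariance_matrix(rows)
loadings, eigenvalue = first_component(cov)
total_variance = sum(cov[i][i] for i in range(len(cov)))
explained_ratio = eigenvalue / total_variance

for name, loading in zip(feature_names, loadings):
    print(f"{name}: {loading:.3f}")
print(f"explained variance ratio: {explained_ratio:.3f}")
```

On real data the features would normally be standardized first, as in clustering, so that loadings are comparable across features measured in different units.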