“Reliable service. Professional drivers. Well-maintained vehicles.” Those phrases anchor most corporate shuttle contracts, and not one of them can be enforced. When a vendor blows eight morning pickups in a week, “reliable” gives procurement nothing to point at: no number, no measurement window, no agreed data source, no consequence. The clause is decoration.
A service-level agreement is supposed to fix that. The working definition procurement teams use, from CIO’s outsourcing guidance, is blunt: an SLA states the service expected from a supplier, the metrics that measure it, and the penalties that follow a miss. For an employer running an outsourced or managed shuttle for 500 to 20,000-plus workers across shifts, the SLA is the one document that turns a vendor’s promise into an obligation your operations team can hold at a monthly review.
The procurement or vendor-management lead who owns that contract, and the facilities or operations VP who owns the daily service, need one document that does a single narrow job: convert every quality the vendor promises into a clause with a number behind it. Selecting a vendor and modeling the cost of the program are separate jobs, covered in our shuttle software buyer’s guide and the build-versus-buy TCO teardown. Public transit agencies have spent decades writing this kind of accountability into third-party bus contracts, and their metric structures port almost directly to an employer shuttle, so most of the real numbers below come from published transit-operating deals.
One test runs through every section. A clause is enforceable only when it names four things: a metric with a formula, a measurement window, a data source, and a credit or penalty tied to performance bands. Miss any one of the four and the clause reverts to decoration.
The vague-promise trap that voids most shuttle SLAs
Start with the failure mode, because it is the reason to write anything down. A contract that promises “on-time, reliable service” and “clean, well-maintained vehicles” has recorded an aspiration, not an obligation. When the first bad month arrives, procurement and the vendor argue about what “reliable” meant, and the argument has no referee, because the contract never appointed one.
An SLA closes that gap by design. Ohio’s Office of Budget and Management, in its contract-performance framework for state procurement, lists the skeleton every enforceable agreement shares: the services and their service levels, how each is measured, each party’s responsibilities, an escalation path, remedies for breach, a protocol for adding or dropping metrics, and cancellation terms. Everything in this guide is that skeleton applied to moving people to and from a worksite.
Four conditions have to hold for a single clause to survive a dispute.
First, a metric with a formula. “On-time” is not a metric; “trips departing a scheduled timepoint no more than 1 minute early and 5 minutes late, divided by completed trips” is. If two readers can compute different numbers from the same clause, you do not have a metric yet.
Second, a measurement window. Every metric needs a period and a boundary: a calendar month, a rolling quarter, peak hours only or all service hours. Punctuality averaged across a full service day quietly buries the 6 a.m. shift-change failures that actually cost the employer a production line.
Third, a named data source. Whose system of record produces the number: the vendor’s dispatch log, GPS telematics, a badge-in feed at the destination gate, or the rider app? When the only source is a spreadsheet the vendor maintains, the vendor grades its own homework.
Fourth, the money. A metric with no credit or penalty attached only produces a report. The consequence is what makes the number binding.
Fail one of the four and the clause cannot be enforced, however solid the other three look. A 95% punctuality target, measured monthly, backed by a $10,000 penalty, is worthless if the underlying data lives in a log the vendor can edit after the fact. That is the trap, and closing it is where the drafting work actually sits. The table below is the map for the rest of this guide; each row is a clause, and each column is one of the four tests. The wider program decisions around operating model and KPIs live in the employee transportation management guide.
| SLA clause | Metric (formula) | Window | Data source | Consequence type |
|---|---|---|---|---|
| On-time performance | On-time trips ÷ completed trips, against a set early/late window | Peak hours, monthly | GPS/AVL timestamps vs. schedule | Credit band by % |
| Service availability | Completed trips ÷ scheduled trips | Monthly | Pull-out log + GPS | Per-missed-trip damages |
| Capacity | Denied boardings; peak load factor | Per trip, rolled up monthly | Boarding counts / app | Per-incident credit + daily availability penalty |
| Continuity | Mean distance between failures; substitution time | Rolling quarter | Maintenance record + telematics | Band-based incentive/penalty |
| Safety & compliance | Preventable accidents per 100,000 mi; testing conformance | Quarterly | Incident log; audit | Rate penalty; cure or termination |
| Communication | Complaint-filing ratio; ETA lead time; hold time | Monthly | Ticketing + phone system | Band credit; per-day penalty |
| Reporting & audit | On-time report delivery; audit access | Monthly / on demand | Vendor report + raw data | Per-day late-report penalty |
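The four-part test can be run mechanically against a draft contract. The sketch below is one hypothetical way to do that; the `SlaClause` type and its field names are illustrative assumptions, not a standard schema.

```python
from dataclasses import dataclass
from typing import List, Optional

# The four things a clause must name to be enforceable.
REQUIRED = ("metric_formula", "window", "data_source", "consequence")


@dataclass
class SlaClause:
    """Hypothetical record of one contract clause."""
    name: str
    metric_formula: Optional[str] = None
    window: Optional[str] = None
    data_source: Optional[str] = None
    consequence: Optional[str] = None


def missing_parts(clause: SlaClause) -> List[str]:
    """Return the required parts the clause leaves blank."""
    return [part for part in REQUIRED if not getattr(clause, part)]


def is_enforceable(clause: SlaClause) -> bool:
    return not missing_parts(clause)


vague = SlaClause(name="Reliable service")
otp_clause = SlaClause(
    name="On-time performance",
    metric_formula="on-time trips / completed trips, 1 min early to 5 min late",
    window="peak hours, monthly",
    data_source="GPS/AVL timestamps vs. schedule",
    consequence="credit band by %",
)

for clause in (vague, otp_clause):
    print(clause.name, "->", missing_parts(clause) or "enforceable")
```

A clause like "Reliable service" fails all four checks at once, which is the point: the audit is a list of blanks to fill, not a judgment call.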
On-time performance: define the window before you set the target
Set the target and skip the window, and you have agreed to nothing. On-time performance is on-time trips divided by completed trips, but the definition of “on-time” is a variable, not a constant. There is no single national standard for it. The most widely used transit window treats a vehicle as on time if it departs no earlier than 1 minute before and no later than 5 minutes after the scheduled time. San Francisco and Missouri agencies tighten the late side to 4 minutes; Washington, DC loosens both ends to 2 minutes early and 7 late, as TransitCenter and Greater Greater Washington have documented.
That spread is the whole game. A 95% target against a 7-minute late window is a far weaker promise than 90% against a 3-minute one, and a vendor who negotiates the loose window while conceding a high percentage has won the clause. Lock the early tolerance especially tight. A shuttle that leaves early strands the rider who arrived on time, which is a worse failure than a late departure, so the early side commonly sits at zero to 1 minute.
| Standard | Early tolerance | Late tolerance |
|---|---|---|
| Industry-typical | 1 minute | 5 minutes |
| San Francisco / Missouri | 1 minute | 4 minutes |
| Washington, DC | 2 minutes | 7 minutes |
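To see how much the window moves the score, the sketch below grades one set of trips against all three windows in the table. The deviations are made-up illustration data, and treating the tolerance boundaries as inclusive is an assumption your clause should state explicitly.

```python
# Early/late tolerances in minutes, taken from the table above.
WINDOWS = {
    "Industry-typical": (1, 5),
    "San Francisco / Missouri": (1, 4),
    "Washington, DC": (2, 7),
}


def on_time_share(deviations_min, early_tol, late_tol):
    """On-time trips divided by completed trips.

    A deviation is actual minus scheduled departure, in minutes:
    negative means early, positive means late. Boundaries are
    treated as inclusive (an assumption).
    """
    on_time = sum(1 for d in deviations_min if -early_tol <= d <= late_tol)
    return on_time / len(deviations_min)


# Hypothetical deviations for ten completed trips.
deviations = [-2, -1, 0, 1, 2, 3, 4.5, 5, 6, 8]

scores = {
    name: on_time_share(deviations, early, late)
    for name, (early, late) in WINDOWS.items()
}

for name, score in scores.items():
    print(f"{name}: {score:.0%}")
```

On this sample the same ten trips score 70%, 50%, and 90% depending on the window alone, before anyone negotiates a target.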
What percentage should the target be? No published employer-shuttle standard exists, so treat any number you set as an internal choice rather than a benchmark you can cite. Real transit contracts give useful reference points. Nassau County’s NICE Bus operations contract sets a 70% on-time standard, measured as arriving between 1 minute early and 5 minutes late, with a $5,000 quarterly incentive-or-penalty attached (National Academies, 2023). San Diego’s system runs closer to 84%. Demand-responsive paratransit, which is the closest analog to a rostered shuttle, contracts far higher: 90% to 96%, with systems using a 30-minute window averaging 93%.
A public bus fighting mixed traffic across hundreds of stops earns that 70% floor. An employer shuttle with a fixed roster, a handful of stops, and a dedicated route has a much simpler job, so many employers write a target in the low-to-mid 90s. Just remember the number is only as strong as the window behind it, and that no standards body has blessed a specific figure for private shuttles. Measure it at the times that matter. On-time performance averaged over a 14-hour service day can read 95% while every shift-change run is late, so window the metric to peak arrival periods where a miss means someone clocks in late.
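The averaging problem is easy to reproduce. In the sketch below, the peak window and the trip list are hypothetical; the data is constructed so that both shift-change runs are late while every other run is on time.

```python
from datetime import time

# Hypothetical shift-change window; set this from your own roster.
PEAK_START, PEAK_END = time(5, 30), time(6, 30)


def otp(trips):
    """On-time trips divided by completed trips."""
    return sum(1 for _, on_time in trips if on_time) / len(trips)


def peak_only(trips):
    """Keep only trips scheduled inside the peak window."""
    return [t for t in trips if PEAK_START <= t[0] <= PEAK_END]


# Each trip is (scheduled departure, was it on time?).
# Two shift-change runs, both late.
trips = [(time(5, 40), False), (time(6, 10), False)]
# Thirty-eight off-peak runs every 20 minutes from 07:00, all on time.
trips += [(time(7 + i // 3, 20 * (i % 3)), True) for i in range(38)]

all_day_otp = otp(trips)
peak_otp = otp(peak_only(trips))

print(f"All-day OTP: {all_day_otp:.0%}")
print(f"Peak OTP:    {peak_otp:.0%}")
```

The all-day figure reads 95% and the peak figure reads 0%, from the same trips. Which of the two the contract measures is a drafting choice.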
Service availability: the completed-trip rate
Punctuality answers whether a trip ran on time. Availability answers whether it ran at all, and the two failures hurt differently. A run can post a perfect on-time score for the 40 riders who caught it while being invisible to the 12 whose 5:40 a.m. pickup never came. Those 12 do not care about the monthly average; they care that they missed a shift.
The metric is the completed-trip rate: completed trips divided by scheduled trips. Its inverse, the missed-trip rate, is where the real contract language lives. NICE Bus contracts a 0% missed-trip standard, calculated as missed pull-outs that result in a full missed trip divided by scheduled pull-outs, again with a $5,000 quarterly consequence (National Academies, 2023). MetroWest Regional Transit Authority in Massachusetts skips percentages and charges per event: $100 for a pull-out more than 5 minutes off schedule, and $500 for any trip that runs 30 or more minutes behind.
For a small employer fleet, that per-occurrence structure often beats a percentage band. A six-van operation might run 40 scheduled trips a day; one missed morning run is 2.5% of the day, a number that vanishes into a monthly average but represents real people stranded at a park-and-ride. A flat $500 per missed peak trip is felt immediately by the vendor and is trivial to audit against a pull-out log. Define “missed” precisely in the clause: a trip that never operated, or one that operated so late against the shift schedule that it delivered no rider on time. Ambiguity here is what vendors litigate.
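The arithmetic behind that comparison is worth writing out. The sketch below uses the six-van example from the text; the 22 service days per month are an assumption for illustration.

```python
def completed_trip_rate(completed, scheduled):
    """Completed trips divided by scheduled trips."""
    return completed / scheduled


def per_occurrence_damages(missed_peak_trips, amount_per_trip=500.0):
    """Flat damages per missed peak trip ($500 in the example above)."""
    return missed_peak_trips * amount_per_trip


SCHEDULED_PER_DAY = 40
SERVICE_DAYS = 22  # assumed number of service days in the month

# One missed morning run, viewed three ways.
daily_missed_rate = 1 - completed_trip_rate(SCHEDULED_PER_DAY - 1, SCHEDULED_PER_DAY)
monthly_scheduled = SCHEDULED_PER_DAY * SERVICE_DAYS
monthly_missed_rate = 1 - completed_trip_rate(monthly_scheduled - 1, monthly_scheduled)
damages = per_occurrence_damages(1)

print(f"Missed-trip rate, that day:   {daily_missed_rate:.2%}")
print(f"Missed-trip rate, that month: {monthly_missed_rate:.2%}")
print(f"Per-occurrence damages:       ${damages:,.0f}")
```

Under these assumptions one missed run is 2.5% of the day but roughly 0.11% of the month, far below any plausible percentage band, while the per-occurrence charge lands in full.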
Capacity and seat guarantees: what happens when the van is full
A rider who reaches the stop and cannot board because the vehicle is full has suffered a complete service failure, and your on-time score will not show it. Punctuality and availability both assume the rider got on. Capacity is the clause that protects the rider who could not.
Write it as a load-factor or occupancy band with a hard consequence for denied boardings. Decide up front whether the SLA guarantees a seat or permits standing, because that single choice resizes the fleet. Then handle the peak. Shift changes and event days produce demand spikes that a base schedule cannot absorb, and the clause needs a surge obligation: a contracted number of spare vehicles or the ability to add capacity within a defined notice period. Demand concentrates in predictable ways, and the midweek commute peak is a good example of the pattern to size against.
Tie the surge obligation to money, the way transit does. San Diego’s contract charges the operator $2,000 per day for failing to field enough buses to run the scheduled service (National Academies, 2023). An employer SLA can mirror that with a two-part remedy: a per-denied-boarding credit for the individual failure, and a daily penalty when contracted capacity is not provided at all. The stakes are higher than an inconvenience. When the San Francisco Municipal Transportation Agency evaluated its commuter shuttle program roughly a decade ago, about 45% of surveyed riders did not own a car. For a large share of a workforce, a full shuttle is not a delay; it is a missed shift with no fallback.
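A two-part capacity remedy can be computed straight from daily records. In this sketch the `ServiceDay` type and the $50 per-denied-boarding credit are hypothetical; the $2,000 daily figure is borrowed from the San Diego contract purely as a placeholder.

```python
from dataclasses import dataclass


@dataclass
class ServiceDay:
    """Hypothetical daily capacity record."""
    vehicles_contracted: int
    vehicles_fielded: int
    denied_boardings: int


def capacity_remedy(days, credit_per_denied=50.0, daily_shortfall_penalty=2000.0):
    """Return (per-denied-boarding credits, daily shortfall penalties).

    Part 1: a credit for each rider who could not board.
    Part 2: a flat penalty for each day the vendor fielded fewer
    vehicles than contracted.
    """
    credits = sum(d.denied_boardings for d in days) * credit_per_denied
    shortfall_days = sum(
        1 for d in days if d.vehicles_fielded < d.vehicles_contracted
    )
    return credits, shortfall_days * daily_shortfall_penalty


week = [
    ServiceDay(vehicles_contracted=6, vehicles_fielded=6, denied_boardings=0),
    ServiceDay(vehicles_contracted=6, vehicles_fielded=6, denied_boardings=3),
    ServiceDay(vehicles_contracted=6, vehicles_fielded=5, denied_boardings=7),
]

credits, penalties = capacity_remedy(week)
print(f"Denied-boarding credits: ${credits:,.0f}")
print(f"Shortfall penalties:     ${penalties:,.0f}")
```

Keeping the two parts separate matters: the second day shows a demand spike with a full fleet on the road, while the third shows a vendor that did not field what it contracted.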
Continuity: breakdowns, substitutions, and driver no-shows
“Well-maintained vehicles” becomes measurable through mean distance between failures, or MDBF: vehicle revenue miles divided by major mechanical failures. The federal government already defines the hard term. Under FTA’s National Transit Database, a major mechanical failure is a failure of a mechanical element that prevents the vehicle from completing a scheduled trip or from starting the next one, excluding collisions, vandalism, and natural disasters. The failure categories are specific: brakes, doors, steering, engine cooling, transmission, and the electrical system.
San Diego’s contract puts a real band on it. A distance-between-mechanical-failures result of 8,000 to 8,999 miles carries a $10,000 swing, paid as an incentive when the operator does better and a penalty when it does worse (National Academies, 2023). That symmetry matters, and the credits section returns to it.
MDBF has a blind spot, though: it tells you nothing about the rider stuck beside a dead van this morning. Pair the reliability metric with a same-day recovery clause. Require a replacement vehicle on scene within a defined number of minutes, a guaranteed backup driver so that a single call-out does not cancel a run, and a recovery-time metric that clocks how long a disrupted route takes to return to schedule. A driver no-show is really an availability failure with a continuity remedy, and the remedy is a spare driver the contract obligates the vendor to keep. Watch the window here as well: a six-van fleet will not accumulate 8,000 miles between failures inside a single month in any statistically meaningful way, so measure MDBF over a rolling quarter and the recovery metric per event.
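The small-fleet window problem shows up as soon as you compute MDBF month by month. The sketch below uses invented mileage and failure counts for a six-van fleet; returning `None` for a zero-failure period is one possible convention, not a standard.

```python
from typing import List, Optional, Tuple


def mdbf(revenue_miles: float, major_failures: int) -> Optional[float]:
    """Vehicle revenue miles divided by major mechanical failures.

    Returns None when there were no failures, because the ratio is
    undefined for that period (an assumed convention).
    """
    if major_failures == 0:
        return None
    return revenue_miles / major_failures


def rolling_quarter_mdbf(months: List[Tuple[float, int]], end: int) -> Optional[float]:
    """MDBF over the three months ending at index `end`."""
    window = months[max(0, end - 2): end + 1]
    miles = sum(m for m, _ in window)
    failures = sum(f for _, f in window)
    return mdbf(miles, failures)


# Hypothetical (revenue miles, major failures) per month.
months = [(9000, 1), (9500, 0), (8700, 2), (9200, 1)]

single_month = [mdbf(miles, failures) for miles, failures in months]
quarter = rolling_quarter_mdbf(months, end=3)

print("Month by month:", single_month)
print("Rolling quarter:", round(quarter))
```

The monthly series jumps between 9,000, undefined, 4,350, and 9,200 on one or two failures, while the rolling quarter settles near 9,100. A band applied to the monthly figure would pay out on noise.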
Safety and compliance you require by reference
Here is the framing procurement teams get wrong. The federal safety rules that govern professional bus operations bind FTA-funded transit and its contractors, not your private shuttle vendor automatically. You do not get to say “the law requires it.” You require it by reference, writing the standard into the contract as an obligation the vendor accepts.
Start with driver testing. Require conformance to 49 CFR Part 655, the FTA drug-and-alcohol rule, run under the Department of Transportation’s Part 40 procedures. That standard covers six testing occasions (pre-employment, random, post-accident, reasonable-suspicion, return-to-duty, and follow-up) and bars any employee from a safety-sensitive function at an alcohol concentration of 0.02 or higher. Naming the standard by citation is cleaner than drafting your own testing protocol, and it is a bar every serious operator already clears.
Normalize safety to a rate, not a raw count. NICE Bus contracts a preventable-accident standard of 1.2 per 100,000 miles, backed by the same $5,000 quarterly mechanism; Sun Tran in Tucson measures preventable accidents per 100,000 revenue miles (National Academies, 2023). A rate-per-exposure is the only fair way to hold a small fleet accountable, because raw incident counts punish the vendor who simply drives more miles.
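A short comparison shows why the rate matters. The two fleets below are hypothetical; the 1.2 threshold is the NICE Bus figure cited above, used here only as a reference line.

```python
STANDARD_PER_100K = 1.2  # NICE Bus contract figure, used as a reference


def accidents_per_100k(preventable_accidents: int, miles: float) -> float:
    """Preventable accidents normalized to 100,000 miles of exposure."""
    return preventable_accidents / miles * 100_000


# Hypothetical quarterly figures: (preventable accidents, miles driven).
fleets = {
    "Vendor A": (2, 250_000),
    "Vendor B": (1, 50_000),
}

rates = {
    name: accidents_per_100k(accidents, miles)
    for name, (accidents, miles) in fleets.items()
}

for name, rate in rates.items():
    verdict = "within standard" if rate <= STANDARD_PER_100K else "over standard"
    print(f"{name}: {rate:.1f} per 100,000 mi ({verdict})")
```

Vendor A has twice the raw accident count but a rate of 0.8; Vendor B has one accident and a rate of 2.0. Only the rate identifies which operator is the riskier one per mile driven.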
Two more sub-clauses belong here. Vehicle maintenance needs a cadence: scheduled preventive-maintenance intervals, a daily pre-trip inspection, and a defect-repair clock. Accessibility needs teeth if any rider depends on it. A compliant lift is built for a minimum 600-pound load and a 30-by-48-inch platform, and the operator must keep accessibility features in operative condition and repair them promptly, per the ADA National Network and the FTA’s shared-mobility guidance. That maintain-in-operative-condition duty is itself an uptime metric you can measure and enforce, not a box to check once. Close the section with insurance and indemnity: minimum coverage limits, additional-insured status for the employer, and a clear indemnification clause. These are the boilerplate that turns into the only thing anyone reads after an incident.
Communication, ETA, and incident escalation
Riders forgive a delay they were warned about. They do not forgive standing in the dark at a park-and-ride, guessing whether the van is coming. The communication clause governs what the rider knows and when, and it is as measurable as anything mechanical.
Give rider-facing notification a lead-time standard. When a run is delayed or cancelled, the SLA should require a push to the affected riders within a set number of minutes, over whatever channel the program uses, whether that is a rider app, SMS, or a WhatsApp broadcast. Then measure rider experience with a complaint-filing ratio, the count of complaints per 10,000 passenger trips. A review of New Orleans paratransit service recommends tying it to money: a ratio under 10 earns the contractor an incentive, and one above 30 triggers liquidated damages (Texas A&M Transportation Institute, 2023). Add a support-line standard while you are there. The same review found a service contract that caps average call hold-time at 2 minutes, with a $100-per-day penalty for every day the standard is missed.
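The complaint ratio and its bands reduce to a few lines. The thresholds of 10 and 30 come from the review cited above; treating the range between them as neutral is an assumption, as are the sample counts.

```python
def complaint_ratio(complaints: int, passenger_trips: int) -> float:
    """Complaints per 10,000 passenger trips."""
    return complaints / passenger_trips * 10_000


def complaint_band(ratio: float) -> str:
    """Map a ratio to a consequence.

    Under 10 earns an incentive and above 30 triggers liquidated
    damages; the middle range carrying no adjustment is assumed.
    """
    if ratio < 10:
        return "incentive"
    if ratio > 30:
        return "liquidated damages"
    return "no adjustment"


def hold_time_penalty(days_over_standard: int, per_day: float = 100.0) -> float:
    """$100 for each day average hold time exceeded the cap."""
    return days_over_standard * per_day


# Hypothetical month with 18,000 passenger trips.
for complaints in (12, 36, 70):
    ratio = complaint_ratio(complaints, 18_000)
    print(f"{complaints} complaints -> {ratio:.1f} per 10k -> {complaint_band(ratio)}")

print(f"Hold-time penalty for 3 bad days: ${hold_time_penalty(3):,.0f}")
```

Normalizing per 10,000 trips keeps the measure comparable as ridership grows, for the same reason the safety clause uses a per-mile rate.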
Escalation is the piece most SLAs leave vague, and it is the piece that matters most at 6 a.m. on a Monday. Define severity tiers, distinguishing a single late run from a route-wide outage that leaves a whole shift without transport, and attach a response clock to each. Name a human, not “the vendor.” An on-call operations contact reachable within a set number of minutes, with a named backup, is the difference between a problem solved before the shift starts and a voicemail returned at noon.
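Severity tiers are easiest to enforce when they are written as a table with a clock on each row. The sketch below is one hypothetical layout; the tier descriptions beyond the two named above, and every response time, are placeholders to negotiate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SeverityTier:
    level: int
    description: str
    response_minutes: int  # placeholder clock, not a recommended value


# Hypothetical tiers; level 1 is the most severe.
TIERS = {
    1: SeverityTier(1, "route-wide outage, whole shift without transport", 10),
    2: SeverityTier(2, "single run cancelled", 20),
    3: SeverityTier(3, "single late run", 60),
}


def response_breached(level: int, minutes_to_response: int) -> bool:
    """True when the on-call contact responded after the tier's clock ran out."""
    return minutes_to_response > TIERS[level].response_minutes


# The same 25-minute response is acceptable for a late run
# and a breach for a route-wide outage.
for level in (1, 3):
    status = "breach" if response_breached(level, 25) else "within clock"
    print(f"Severity {level}: {status}")
```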
Reporting, data ownership, and audit rights: the evidence layer
This is the hinge the whole document swings on. An SLA you cannot measure independently is decorative, and CIO’s guidance makes the point directly: enforcement requires the customer to verify the service levels, which is why providers are expected to make the underlying statistics available so the customer can check its own entitlement to credits. Every clause above assumes a trustworthy number. This section is where that number comes from.
Specify the monthly performance report in detail. It should carry each metric, the window it was measured over, the raw counts rather than only the percentages, an exceptions log listing every missed or late trip, and the credit-or-penalty calculation for the period. Then go past the report to the data underneath it. The employer should own the raw trip data, the GPS timestamps, boarding counts, and incident records, in a machine-readable export, not merely a monthly PDF. Pair that with a right to audit: the employer or its agent can inspect the raw data and recompute the vendor’s numbers. Without it, you are back to the vendor grading its own homework, the exact failure the framing section warned about.
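The audit right only has value if recomputing is cheap. The sketch below assumes a raw export with hypothetical column names (`trip_id`, `scheduled`, `actual`, `completed`) and a 1-minute-early, 5-minute-late window; a real export will differ, and the vendor-reported figure here is invented.

```python
import csv
import io
from datetime import datetime

# Hypothetical machine-readable export of raw trip data.
RAW_EXPORT = """trip_id,scheduled,actual,completed
T1,2024-03-04T05:40:00,2024-03-04T05:41:00,1
T2,2024-03-04T06:10:00,2024-03-04T06:18:00,1
T3,2024-03-04T06:40:00,,0
T4,2024-03-04T14:10:00,2024-03-04T14:08:00,1
T5,2024-03-04T14:40:00,2024-03-04T14:44:00,1
"""


def recompute_otp(export_text, early_tol_min=1, late_tol_min=5):
    """Recompute on-time performance and an exceptions log from raw rows."""
    completed = on_time = 0
    exceptions = []
    for row in csv.DictReader(io.StringIO(export_text)):
        if row["completed"] != "1":
            exceptions.append((row["trip_id"], "missed"))
            continue
        completed += 1
        scheduled = datetime.fromisoformat(row["scheduled"])
        actual = datetime.fromisoformat(row["actual"])
        deviation = (actual - scheduled).total_seconds() / 60
        if -early_tol_min <= deviation <= late_tol_min:
            on_time += 1
        else:
            exceptions.append((row["trip_id"], f"{deviation:+.0f} min"))
    return on_time / completed, exceptions


recomputed, exceptions = recompute_otp(RAW_EXPORT)
vendor_reported = 0.75  # invented figure from the vendor's monthly report

print(f"Recomputed OTP: {recomputed:.0%}, vendor reported: {vendor_reported:.0%}")
print("Exceptions:", exceptions)
if abs(recomputed - vendor_reported) > 0.005:
    print("Discrepancy: raise at the monthly review")
```

The exceptions log falls out of the same pass, which is why the report should carry raw counts: a percentage alone cannot be reconciled against anything.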
Treat the SLA as a governed process, not a filed document. ISO/IEC 20000-1, the IT service-management standard, formalizes service-level management as a named discipline of agreeing service levels, monitoring against them, and reviewing at set intervals; the same cadence applies to a shuttle, with a standing monthly or quarterly review where the numbers are read aloud and the metric set is adjusted. Resist the temptation to stuff the contract with 20 indicators. The Transportation Research Board’s performance-measurement guidebook favors a small set of measurable indicators, each tied to a data source, over a long list nobody reviews. Six to eight metrics with real consequences beat twenty that decorate a dashboard.
An independent measurement layer is what makes this section real rather than aspirational. This is the single place Ryde belongs in a shuttle SLA: sitting alongside the operator as the system of record, with GPS and ETA tracking, on-time-performance data, an analytics dashboard, and an audit trail that timestamps every trip, so the on-time number in the monthly report comes from the platform, not the operator’s spreadsheet. Ryde supplies the evidence; the vendor still runs the fleet. The analytics and reporting page covers what an independent data pipeline needs to produce.
Credits, penalties, incentives, and the exit ramp
Two remedy types do the work. A service credit reduces the fees the employer owes for the period the vendor missed a standard; a financial penalty is a separate payment the vendor makes for the breach (CIO). Credits are easier to run because they net against an invoice you already control, so most shuttle SLAs lead with credits and reserve outright penalties and termination for repeated or safety-related failures.
The common objection is that penalties sour the relationship. The transit answer is to make the schedule symmetric. San Diego’s system scores fleet-wide on-time performance on a progressive scale that runs from a $7,500 bonus down to a $10,000 penalty across defined bands (National Academies, 2023). It layers on an operator-quality measure: when fewer than 10% of drivers fall below 70% on-time, the operator earns a bonus that scales from $2,500 up to $15,000. A schedule built that way rewards over-performance instead of only fining misses, which keeps a good vendor invested rather than defensive.
Three design rules keep the schedule honest. Cap the downside, so a single catastrophic month does not bankrupt the vendor and detonate the relationship. Allow earn-back, so a vendor that recovers and sustains performance can win credits back. And tie the largest consequences to the metrics that actually hurt your operation, on-time performance at shift change and denied boardings, rather than weighting every metric equally. Match the shape to the metric, too: per-occurrence damages suit availability, where each missed 6 a.m. run is a discrete felt event, while bands suit aggregate measures like on-time performance and MDBF.
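A schedule that combines bands, per-occurrence damages, and a cap might be sketched as follows. Every number here is a hypothetical placeholder (band floors, percentages, the 15% cap, the $60,000 monthly fee); only the $500 per missed peak trip echoes the availability example earlier.

```python
# Band floors on peak on-time performance, mapped to an adjustment
# expressed as a fraction of the monthly fee. Positive is a bonus to
# the vendor; negative is a credit to the employer. All hypothetical.
BANDS = [(0.97, 0.02), (0.93, 0.00), (0.90, -0.03), (0.00, -0.08)]
CAP = 0.15  # total credits never exceed 15% of the monthly fee


def band_adjustment(peak_otp: float) -> float:
    """Return the fee fraction for the first band whose floor is met."""
    for floor, fraction in BANDS:
        if peak_otp >= floor:
            return fraction
    return BANDS[-1][1]


def monthly_adjustment(fee, peak_otp, missed_peak_trips, per_trip=500.0):
    """Net adjustment to the invoice: band result minus per-trip damages,
    with the downside capped."""
    net = band_adjustment(peak_otp) * fee - missed_peak_trips * per_trip
    return max(net, -CAP * fee)


FEE = 60_000.0
good_month = monthly_adjustment(FEE, peak_otp=0.98, missed_peak_trips=0)
bad_month = monthly_adjustment(FEE, peak_otp=0.89, missed_peak_trips=3)
worst_month = monthly_adjustment(FEE, peak_otp=0.85, missed_peak_trips=12)

for label, amount in [("good", good_month), ("bad", bad_month), ("worst", worst_month)]:
    print(f"{label}: {amount:+,.0f}")
```

Under these assumptions the worst month would owe $10,800 uncapped and pays $9,000. Earn-back is left out of the sketch; one common shape is refunding part of a prior credit after a set number of consecutive months at target.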
Every SLA also needs a way out that is not a lawsuit. Federal procurement guidance builds two standard exits into every contract, and they are worth copying. A termination for convenience lets the buyer end the agreement when that serves its interest, paying the vendor for work performed up to that point. A termination for default, or cause, ends it for the vendor’s failure; and if a default termination is later found unjustified, FTA’s procurement manual notes it converts to a convenience termination, which is why the cause threshold has to be written precisely.
Define cause in numbers. Three consecutive months below your on-time floor, or two safety-standard breaches in a quarter, after a stated cure period, is a defensible trigger; “unsatisfactory service” is not. Then require transition assistance and data handover on exit: your roster, route definitions, and trip history returned in a usable format, so the next operator or an in-house team starts from data rather than zero. That last point ties straight back to the data-ownership clause. Pilots that never set these thresholds tend to collapse in ambiguity, which is one reason shuttle pilots stall around week six, and getting the exit terms right is part of the same discipline that keeps transportation costs from drifting after the contract is signed.
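Writing cause as a rule makes it testable before anyone needs it. The sketch below encodes the two example triggers from this section; the 90% floor, the sample months, and the function names are assumptions.

```python
def consecutive_below(values, floor, run=3):
    """True if `run` or more consecutive values fall below the floor."""
    streak = 0
    for value in values:
        streak = streak + 1 if value < floor else 0
        if streak >= run:
            return True
    return False


def cause_triggered(monthly_otp, otp_floor, safety_breaches_in_quarter,
                    cure_period_elapsed):
    """Termination for cause: three consecutive months below the on-time
    floor, or two safety-standard breaches in a quarter, and only after
    the stated cure period has run."""
    breach = (
        consecutive_below(monthly_otp, otp_floor, run=3)
        or safety_breaches_in_quarter >= 2
    )
    return breach and cure_period_elapsed


OTP_FLOOR = 0.90  # hypothetical floor

sustained_miss = [0.94, 0.89, 0.88, 0.87]
intermittent_miss = [0.89, 0.91, 0.88, 0.89]

print(cause_triggered(sustained_miss, OTP_FLOOR, 0, cure_period_elapsed=True))
print(cause_triggered(intermittent_miss, OTP_FLOOR, 0, cure_period_elapsed=True))
print(cause_triggered(intermittent_miss, OTP_FLOOR, 2, cure_period_elapsed=False))
```

The second series has three months below the floor but never three in a row, so it does not trigger. Whether that is the outcome you want is worth deciding at drafting time, not in a dispute.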
Frequently Asked Questions
What is a good on-time performance target for an employee shuttle?
There is no published employer-shuttle standard, so any target you set is an internal decision, not an industry benchmark you can cite. For reference, public fixed-route bus contracts run 70% to 85% (Nassau County’s NICE Bus contracts 70%), while demand-responsive paratransit runs 90% to 96%, with a 30-minute-window average of 93% (National Academies, 2023). A rostered employer shuttle with few stops resembles the paratransit case more than the public-bus case, so many employers write a target in the low-to-mid 90s. Whatever number you pick means little without a defined early/late window behind it, because 95% against a 7-minute late window is a weaker promise than 90% against a 3-minute one.
How is on-time performance calculated for a shuttle?
On-time performance is the number of on-time trips divided by the number of completed trips, expressed as a percentage. A trip counts as on time when it departs or arrives at a scheduled timepoint inside an agreed window, commonly no more than 1 minute early and 5 minutes late (TransitCenter). The window is part of the definition, not a footnote, so agree on it before you agree on the target.
What is the difference between a service credit and a penalty?
A service credit reduces the fees you owe for the period the vendor missed the standard, while a financial penalty is a separate payment the vendor makes for the breach (CIO). Credits are simpler to administer because they net against an invoice you already control, which is why most shuttle SLAs lead with credits and hold penalties and termination in reserve for repeated or safety-related failures.
Which KPIs belong in a shuttle vendor contract?
Six to eight, each with a formula, a window, a data source, and a consequence. The core set covers on-time performance against a defined window, the completed-trip rate, denied boardings or load factor, mean distance between failures for maintenance, preventable accidents per 100,000 miles for safety, and a complaint-filing ratio for rider experience, with metric structures drawn from published transit contracts (National Academies, 2023). Add an accessibility-uptime metric if any rider depends on a lift or ramp. Resist longer lists; the Transportation Research Board’s guidance favors a few measurable indicators tied to a data source over twenty that nobody reads at the review.
Start with the contract you already have
Run every clause in your current shuttle contract past one test before you renew or re-bid it: does it name a metric with a formula, a measurement window, a data source, and a consequence? A clause missing any of the four is decoration, and now you know how to rewrite it.
Do the audit this week. Pull the contract and highlight every phrase that names a quality without a number: “reliable,” “professional,” “well-maintained,” “prompt.” Each highlight is a clause to replace. Then pick the six to eight metrics that matter for your site’s shift pattern, attach a window and a data source to each, and draft the credit bands yourself before the vendor drafts them for you. Borrow the real numbers from transit contracts as your floor, and set your target where a simpler network earns it.
The employers who get the most out of a shuttle program treat the SLA as a living scorecard, read every month, not a document signed and filed. That habit, more than any single threshold, separates a program that improves from one that drifts. If you want the independent on-time and occupancy data that makes these clauses provable, rather than a number the operator reports on itself, Ryde’s team can walk you through it.
Sources
- TransitCenter, “What Does ‘On-Time’ Even Mean?” — https://transitcenter.org/bus-time-even-mean/
- Greater Greater Washington, “Use Industry Standards for Bus and Rail On-Time Performance” — https://ggwash.org/view/9463/use-industry-standards-for-bus-and-rail-on-time-performance
- National Academies of Sciences, Engineering, and Medicine (TCRP), “Third-Party Contracts for Fixed-Route Bus Operations and Maintenance: Performance Metrics” — https://www.nationalacademies.org/read/27074/chapter/6
- Federal Transit Administration, “2022 National Transit Database Full Reporting Policy Manual” — https://www.transit.dot.gov/sites/fta.dot.gov/files/2022-09/2022-NTD-Full-Reporting-Policy-Manual-1-0_0.pdf
- Federal Register (FTA), “National Transit Database Reporting Changes and Clarifications” (2017) — https://www.federalregister.gov/documents/2017/10/27/2017-23380/national-transit-database-reporting-changes-and-clarifications
- Electronic Code of Federal Regulations, “49 CFR Part 655” — https://www.ecfr.gov/current/title-49/subtitle-B/chapter-VI/part-655
- ADA National Network, “ADA and Accessible Ground Transportation” — https://adata.org/factsheet/ADA-accessible-transportation
- Federal Transit Administration, “Shared Mobility FAQs: Americans with Disabilities Act (ADA)” — https://www.transit.dot.gov/regulations-and-guidance/shared-mobility-faqs-americans-disabilities-act-ada
- CIO.com, “What is an SLA? Best practices for service-level agreements” — https://www.cio.com/article/274740/outsourcing-sla-definitions-and-solutions.html
- State of Ohio, Office of Budget and Management, “Contract Performance and Service Level Agreements” — https://archives.obm.ohio.gov/Files/Major_Project_Governance/Resources/Resources_and_Templates/04_Plan/24_Contract_Performance_and_Service_Level_Agreements.PDF
- Federal Transit Administration, “Best Practices Procurement & Lessons Learned Manual (Report 0105)” — https://www.transit.dot.gov/funding/procurement/third-party-procurement/best-practices-procurement-manual
- Transportation Research Board, “TCRP Report 88: A Guidebook for Developing a Transit Performance-Measurement System” — https://onlinepubs.trb.org/onlinepubs/tcrp/tcrp_report_88/guidebook.pdf
- International Organization for Standardization, “ISO/IEC 20000-1:2018 Information technology — Service management” — https://www.iso.org/standard/70636.html
- Texas A&M Transportation Institute, “New Orleans Paratransit Study, Final Report Volume II: JP Transit / MITS” (February 2023) — https://www.norpc.org/wp-content/uploads/2024/11/Final-Report-Volume-II-MITS-021523.pdf
- San Francisco Municipal Transportation Agency, “Commuter Shuttle Pilot Program Reduces Conflicts with Muni, Takes Cars off the Streets, According to Evaluation” — https://www.sfmta.com/press-releases/commuter-shuttle-pilot-program-reduces-conflicts-muni-takes-cars-streets-according-0