I recently got introduced to the concept of Split Ticketing, where you split up your train journey into multiple segments to reduce the overall cost.
This may seem strange, but it is in part due to reducing the flexibility of your route, and by doing this you can reduce the cost.
This got me thinking, how would you design such a system; there are some currently available, such as Money Saving Experts demo or Split Your Ticket.
Getting from A to B
In order to work out the cheapest route, you need to know the route to begin with. This area of Computer Science is pretty much covered in terms of the basics (although research continues in the areas of optimising large data sets). The front runners are A* and Dijkstra’s algorithm. A* seems the best for us as the dataset (train stations) can be put in a logical order where a heuristic can be easily designed.
A* works by attempting to plot a route between 2 points via intermediate points by calculating a weighting between each point, A* excels because it follows the path of least cost (by utilising a process queue).
The points will be the train stations that can be visited, next we need to work out the weighting heuristic. Things get a bit more interesting though as you can get trains that traverse multiple stations in ‘one leap’ rather than buying tickets per station as well. However this gets handled by A* as the weight would be automatically adjusted and computed.
In real life…
According to the ‘Office of Rail Regulation’ there are currently 2352 stations in the UK, and they conveniently provide a spreadsheet with their names and postcodes. By using Ordance Surveys OpenData project I can for free convert these postcodes into grid references that I can use to easily plot locations and calculate distances.
How much will it cost?
There is not just the factor of cost that needs (or can) be factored in, overall there are several factors to consider:
- Cost of ticket between two stations
- Journey time between stations
- Wait time at each station connection
The first two are fixed costs however the 3rd is dependent on the connecting train, therefore it’s weight would be placed on the n+1 segment of that particular route.
For the purpose and entire concept of split ticketing, the cost of the ticket is the major factor. I feel the weight time then becomes the 2nd most important (no one wants a 8hour wait overnight because the tickets are cheap)
- 60% – Cost of ticket (pounds) between two stations (C) – Lower is better
- 30% – Wait time (minutes) at each station (W) – Lower is better
- 10% – Journey time (minutes) (T) – Lower is better
Hn = 0.6 * Cn * W1 + 0.3 * Wn-1 * W2 + 0.1 * Tn * W3
C = Hf
Hn < C
Putting it all together
The cost of train fairs between stations can be worked out using the open source librailfare API which most train operators use. The journey times, and availability of trains (to work out the wait times at the station) would need to be gathered from the train operators.
All this could be knocked up using ASP.Net, but Azure would seem like a great oppurtunity (parrallel processing anyone!)