We set out to build a model that accurately predicts the arrival of cargo ships.
There’s one big problem with global logistics today: a lack of accurate, up-to-date, and accessible information for all stakeholders involved.
The problem is best illustrated through an example. A German industrial company has a supplier in China. The company has adopted a just-in-time, lean manufacturing process, which only works with a well-functioning logistics network for inventory supply. That network needs to be reliable, so the company knows that each delivery will arrive when it is needed for manufacturing, and transparent, so that the company can adapt quickly if anything unforeseen happens. Otherwise, it would either face expensive manufacturing downtime or have to go back to building buffers into its inventory supply, which defeats the point of a just-in-time inventory strategy.
Logistics reliability today is arguably low. For example, ocean freight carriers have a reliability rate of between 60% and 67%. According to our research, road transport and, to a degree, rail transport also struggle with reliability issues. Furthermore, reliability is declining as heavier traffic on all transport networks has led to more congestion and delays. Thus, the need for transparency and reliability is higher than it has ever been.
Getting insights into logistics operations is tedious due to the lack of standardisation across data sources. Often, companies only get track-and-trace data that tells them when goods have passed through a logistics hub like a port or a distribution centre. For many, that is not enough. They require data on where goods are or what is going on while cargo is in transit, as that is where delays happen: for example, when ships have to reroute because of a storm.
As a workaround, logistics operators in industrial companies spend their days on the phone, trying to get hold of someone who can give them an update on their shipment. This task is becoming more difficult as outsourcing grows more popular in the logistics sector. Today, most logistics operators do not know which subcontractors their freight forwarder uses. In fact, logistics operators often have their own contractors that are in charge of the actual transportation. Thus, even they are unaware of who the subcontractor is. This complexity makes information harder to get than ever before.
The same is true in reverse. If something happens to cargo when it is in transport, be it delays, damage or theft, the information flow from the trucking company to the industrial manufacturer is cumbersome and sluggish as there is little direct contact between the two. Often, it takes days before an industrial company receives information that something unforeseen has happened with their cargo.
We started this project when a logistics company told us they wanted to become more data-driven but had encountered problems with gathering and using data to improve their business. With our in-house data science expertise, we saw an opportunity for WATTx to help the industry develop new data-oriented solutions.
First, we did comprehensive user-experience (UX) and desktop research. We started by reading everything that we could find on how modern logistics works, which gave us a better understanding of the challenges and opportunities of the industry. Then we went out and interviewed over 40 logistics professionals. We spoke to freight forwarders, ocean freight carriers, in-house logistics departments, ports, and supply chain operators. This gave us insights into the challenges being faced and what professionals in the industry wanted to change or improve.
Typical questions we looked to answer were: “what is your biggest challenge?”, “why is there a problem?”, “what are the consequences of this problem?”, “which stakeholders are affected by it and how?” and “how often did this problem occur and where?”. This gave us insight into the depth of the problem and the severity of the need. It also gave us an indication of the willingness to pay for a solution.
One topic that kept coming up was the need for more transparency, so we decided to look into that problem. Through internal ideation sessions, we came up with a number of different ideas on how to improve transparency in logistics. In the end, we decided to focus on one idea: providing a more accurate Estimated Time of Arrival (ETA) for freight transports by integrating various data sources and applying machine learning methods to the data. As a first step, we decided to concentrate on ocean freight transports, as there was a lot of publicly available data that we could leverage.
We used AIS (Automatic Identification System) vessel-tracking data from the US government and combined it with data on US ports as well as weather data to create a machine learning model. By training and adjusting that model, we were able to predict when a vessel would arrive with an accuracy of +/- 90 minutes*. Compared with ocean freight schedules, where the difference between when a ship should arrive and when it actually arrives can be days or, in some cases, weeks, our model is a massive improvement.
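To make the idea of combining data sources more concrete, here is a minimal Python sketch. It is an illustration only, not the model described above: the field names, the `build_features` and `naive_eta` helpers, the choice of features, and the example values are all assumptions. It joins one vessel position report with a port location and a weather reading into a feature row, and computes a naive distance-over-speed ETA that a trained model would be expected to beat.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0
KNOT_IN_KMH = 1.852  # one knot is one nautical mile (1.852 km) per hour


@dataclass
class AisPing:
    """One vessel position report (field names are illustrative)."""
    timestamp: datetime
    lat: float
    lon: float
    speed_knots: float


@dataclass
class Port:
    name: str
    lat: float
    lon: float


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def build_features(ping, port, wind_speed_ms):
    """Join one ping with port and weather data into a feature row.

    The feature set is a hypothetical example, not the one used in
    the project.
    """
    return {
        "distance_km": haversine_km(ping.lat, ping.lon, port.lat, port.lon),
        "speed_kmh": ping.speed_knots * KNOT_IN_KMH,
        "wind_speed_ms": wind_speed_ms,
        "hour_of_day": ping.timestamp.hour,
    }


def naive_eta(ping, features):
    """Baseline ETA: remaining distance divided by current speed."""
    if features["speed_kmh"] <= 0:
        return None  # stationary vessel: no estimate possible
    hours = features["distance_km"] / features["speed_kmh"]
    return ping.timestamp + timedelta(hours=hours)


if __name__ == "__main__":
    # Made-up example values; port coordinates are approximate.
    ping = AisPing(datetime(2018, 3, 1, 6, 0, tzinfo=timezone.utc),
                   33.0, -120.0, 14.0)
    port = Port("Los Angeles", 33.73, -118.26)
    row = build_features(ping, port, wind_speed_ms=8.0)
    print(row)
    print(naive_eta(ping, row))
```

In a setup like this, rows such as the one returned by `build_features` would be collected for many historical voyages and paired with the observed arrival times to train a regression model. The naive baseline is useful mainly as a yardstick for that model.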
Although our model is in itself a big step towards more transparency for ocean freight transports, there is room for fine-tuning and expanding our solution. Primarily, we need a better understanding of which factors affect the last 100 kilometers of an ocean transport. For that, we need more domain knowledge and a better understanding of the potential delays an ocean transport might face before entering a port.
To improve our model and test what we built in a real-world setting, WATTx is looking for a company to partner with. If you work for a port, an ocean freight carrier, or a freight forwarding company, or if you work in logistics for an industrial company, and you would be interested in using what we have built, don’t hesitate to reach out to us by contacting our Head of Venture Development, Tristan Rouillard.
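One way to keep the final approach separate when measuring accuracy is sketched below in Python. This is an assumption about how such an evaluation could be set up, not a description of how the +/- 90 minute figure was obtained: the `Prediction` record, the `evaluate` function, and the choice of metrics (mean absolute error and the share of predictions within a tolerance) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    """One ETA prediction paired with the observed outcome (illustrative)."""
    distance_to_port_km: float  # distance to port when the prediction was made
    predicted_minutes: float    # predicted time remaining until arrival
    actual_minutes: float       # observed time remaining until arrival


def evaluate(predictions, min_distance_km=100.0, tolerance_minutes=90.0):
    """Score predictions made at least min_distance_km from the port.

    Returns the mean absolute error in minutes and the share of
    predictions whose error is within tolerance_minutes, or None if
    no prediction qualifies.
    """
    errors = [
        abs(p.predicted_minutes - p.actual_minutes)
        for p in predictions
        if p.distance_to_port_km >= min_distance_km
    ]
    if not errors:
        return None
    within = sum(1 for e in errors if e <= tolerance_minutes)
    return {
        "n": len(errors),
        "mae_minutes": sum(errors) / len(errors),
        "share_within_tolerance": within / len(errors),
    }


if __name__ == "__main__":
    # Made-up sample values for illustration only.
    sample = [
        Prediction(500.0, 1000.0, 1060.0),  # error: 60 minutes
        Prediction(250.0, 600.0, 480.0),    # error: 120 minutes
        Prediction(50.0, 100.0, 400.0),     # inside the last 100 km, skipped
    ]
    print(evaluate(sample))
```

Running the same function with `min_distance_km=0` would include the final approach, which makes it easy to see how much of the total error comes from the last stretch before the port.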
* Not including the last 100 km of a trip