July 08, 2019
It sounds like the premise of a joke: A Venture Developer, a UX Researcher, and a Data Scientist come together to found a company that initially resulted from a WATTx hackathon project. We sat down with Tristan, Alex, and Kostya to talk about their decision to found this venture, their unique team setup, the challenges it entails, as well as their excitement for the developments to come around AI and image annotation.
Tristan (T): Hasty is an AI-powered image annotation tool that speeds up the process of labelling significantly.
Kostya (K): The shortcut to ground truth.
Alex (A): Annotation and quality control simplified through the use of AI.
K: We were working on a project with another WATTx venture, Deevio, where I had to annotate hundreds of medical caps in order to detect intermixing - manually! Deevio uses deep learning to assist with quality assurance in manufacturing – this is one of the use cases Hasty is good for, because such projects often require very detailed image labelling, which is extremely time-consuming. So naturally, I tried different tools but wasn’t satisfied and thought: why not use the neural networks I already have to speed up the annotation process? After that, a colleague of mine and I decided to work on this idea during one of our hackathons at WATTx, and that’s where Hasty was born.
A: That was in October 2018. At first, we spent a couple of months fiddling around with whether this should become a project or not. Then, in January 2019, we got started with actually building the tool.
T: We are at the end of our closed alpha, which means we were initially targeting key users from the industry in order to collect feedback on what is good about the tool and what needs to be improved. So far, the feedback has been extremely positive; we’ve got some solid indications and are well on the way to finishing a market-ready, production-ready version. In August, we’re looking to release the open beta, which will be the first commercial version of the product.
T: Generally speaking, anyone in the area of image annotation strongly believes that labelling images will become commoditized. We tend to agree with that - and that’s exactly our goal.
Basically, we want to make the manual process of labelling images as insignificant as possible by letting the AI features that run in the background automate the manual work required to annotate an image dataset. We’ve achieved this for some use cases, especially instance segmentation and object detection, which is where the biggest pain points of our target users lie.
Now, we’re looking to include other features that could create a lot of impact according to the user research we did, such as automated conversion from instance to semantic segmentation or AI-powered quality assurance.
78% of image annotation and AI projects get stuck at the PoC stage due to a lack of quality or availability of the ground truth data set, and Hasty will very soon be able to solve that problem.
A: You have hundreds of thousands of images, all of them with annotations whose quality is very important - and this gets more important by the day. When you have AI coming into everyday life, with diagnosing cancer cells as malignant or benign for instance, AI goes from being a research project to being something that might impact people’s lives in both a negative and a positive way. If you have faulty data in the medical sector and many other sectors for that matter, you might - worst case scenario - kill someone. Then data quality becomes of utmost importance.
T: Robustness is also essential. We think there’s a perception that labelling images is a project-based thing: You start a project, you gather your images, you label them, you implement the AI, and you’re finished. But the reality is that there’s a big difference between what you can do in a controlled environment versus actually putting something into production when it needs to be robust. And this requires retraining your data. We believe that the importance and significance of that retraining process is heavily undervalued: retraining will always be ongoing. Things will constantly change and evolve, new products will come up, new faults will be detected, and we will see new environments, which means that you’ll constantly have to annotate new datasets.
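The human-in-the-loop cycle Tristan describes - propose, correct, retrain, repeat - can be sketched in a few lines. This is an illustrative sketch only, not Hasty’s actual API; the stub stands in for a real vision model, and all names here are invented:

```python
class StubModel:
    """Stand-in for a real vision model (illustrative only)."""
    def __init__(self):
        self.trained_on = 0

    def predict(self, image):
        # A real model would return proposed annotations here.
        return f"proposal-for-{image}"

    def fit(self, labelled):
        # A real model would retrain on the corrected labels.
        self.trained_on = len(labelled)


def assisted_annotation_loop(images, model, annotate, retrain_every=100):
    """Model proposes labels, a human accepts or corrects them, and the
    model is periodically retrained on the corrections, so its proposals
    keep improving as the dataset grows and the environment changes."""
    labelled = []
    for i, image in enumerate(images, start=1):
        proposal = model.predict(image)    # AI suggestion
        label = annotate(image, proposal)  # human accepts or corrects
        labelled.append((image, label))
        if i % retrain_every == 0:
            model.fit(labelled)            # fold corrections back in
    return labelled
```

The key point is that the loop never terminates in practice: new images keep arriving, so retraining is a standing process rather than a one-off project phase.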
K: And the complexity of tasks increases all the time. In the beginning, it was important to simply understand whether an image contains, for example, a person or not. Then they started to analyze exactly where the person is located, followed by their pose and what they are doing. Today, a person can be split into their smaller parts, like hands, feet, or hair. Hence nowadays, more and more labelling is required.
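The increasing granularity Kostya describes shows up directly in the annotation records themselves. A hypothetical progression in COCO-like dictionaries (field names and coordinates are illustrative, not any real dataset schema):

```python
# 1. Classification: one label for the whole image
cls_ann = {"image_id": 1, "label": "person"}

# 2. Object detection: where the person is, as [x, y, width, height]
det_ann = {"image_id": 1, "category": "person", "bbox": [40, 20, 60, 150]}

# 3. Pose estimation: keypoints as (x, y, visibility) triples
pose_ann = {"image_id": 1, "category": "person",
            "keypoints": [(55, 30, 2), (50, 60, 2), (70, 60, 1)]}

# 4. Part segmentation: one polygon (flat x/y list) per body part
part_ann = {"image_id": 1, "category": "person",
            "parts": {"hand": [[90, 95, 95, 100, 88, 102]],
                      "hair": [[50, 18, 62, 18, 56, 28]]}}
```

Each step down the list multiplies the labelling effort required per image.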
The three co-founders Tristan, Alex, and Kostya (from left to right)
A: From my point of view, the annotation part is working really well. If you want to do instance segmentation, for instance - drawing a polygon or mask around an object in an image - I think we’re better at that than anyone else. The same goes for bounding boxes and semantic segmentation.
K: I personally like Hasty’s user experience. A lot of the existing tools are made by very technically capable people that might lack a bit of user empathy or UX design experience: For example, we found that when you create a polygon in most of these existing tools, you can’t edit it while you’re creating it. Which means that if you make a mistake, you have to go back and start over. And this is what users might perceive as irritating, especially if they have to create a polygon with hundreds of vertices. We’re trying to improve on that, and I think we’ve done a good job so far.
T: I think the biggest challenge for Hasty today will be to understand, on a granular level, which verticals we’re the most valuable in. It’s a double-edged sword: on the one hand, we have a tool with a very strong horizontal application to almost any industry. On the other, what the annotation task and its respective pain points look like differs by industry - which is what we still need to understand better. It’s great that we have a horizontal tool because it expands the size of the potential market, but it would be prudent for us to initially focus on the two, or at most three, vertical sectors with the largest pain points. And we still have to identify those. We have some ideas, namely: agriculture, manufacturing, transportation with autonomous driving, medical, and satellite imaging. But those are hypotheses and have yet to be proven.
T: I love the use case of medical imaging because there’s a very strong social impact attached to it. If you improve diagnoses, that’s great. Agriculture has also got a strong socio-economic impact. With those two, it goes beyond just making money. It’s doing good while doing well.
A: To me what’s exciting is that, in theory, we would allow people, organizations, companies that have not been able to do these sorts of projects before - because of the cost and lack of knowhow - to use vision-based AI. In the long run, I think it could be really cool to enable not just companies coming from the Silicon Valley to create AIs but to enable people all over the world to create AI models for whatever use case they need it for.
T: For me, there are three main points: The first one is these two guys (points at Alex and Kostya). I really like working with them.
The second is that it’s a topic where the technology behind it is not trivial, as we are trying to push the boundaries of what’s currently happening in research. I like the very advanced technical nature of what we’re building.
The third one is the significant implication democratizing this kind of technology and making it usable for the average person would have. That’s exciting.
A: On top of that, it’s interesting to work on a product and not mainly a service-oriented business. What you do in one iteration has value when you go to the second iteration, has value when you go to the third iteration… You’re always building on top of it. Also, having been in WATTx for three and a half years now, I think it was time to drill deeper into a specific topic.
K: For me it is a great chance to implement the most recent and awesome Machine Learning techniques, especially because some tasks are challenging.
T: It’s a lot of work and surely not a nine to five job. You do take on additional responsibilities. What we still need to find out is the real gravity of being a shareholder and owning a company, for example, being responsible for the pay-checks of people that work at Hasty and trust you with their future. That’s daunting.
On the other hand, it’s really exciting to be part of creating something and having a part in making it. There’s a sense of ownership that is not that easy to achieve when you have a pay-check and at the end of the day you could leave the job at any point. Here, you’re really tied to the future and to the success of it. So, there’s this feeling of commitment, which is tough, but also extremely rewarding because of the feeling of achievement.
T: Yes! You don’t know what you don’t know. There are a lot of guidelines you can read, and we talk to a lot of advisors, but the reality of the situation is: every company is different and it’s not always obvious if you’re missing something fundamental.
A: We talked to our users, and they love and understand what we do and why we do it. But we underestimated how tough it is to explain the value of Hasty to people that are in neither the image annotation nor computer vision spheres.
If you’re from outside the topic, it seems like what we do might not be needed for that long, with unsupervised learning becoming more feasible. In reality, as soon as you talk to people with some know-how, they understand that what we’re doing is something people are going to continue to need for at least the next several years.
K: Until we completely automate the annotation process! (laughs)
K: Hasty doesn’t exist to be the world’s best annotation tool. Our why is: the shortcut to ground truth. Your ground truth is the reference data set that you train and validate your AI on. However you generate that ground truth - whether through supervised learning, unsupervised learning, or synthetically generated data - you can use advanced, AI-powered image-based augmentation.
We want to be the go-to tool for training image-based AIs. So, when we solve the problem of image annotation, we’ll move on to the next part of the value chain: 3D images, then videos, then 3D videos, and so on.
T: I can answer that by way of an example: We were discussing our market entry strategy the other day, and the question of what the profile of our target user would look like came up. It was really interesting to see that all three of us approached it from our traditional backgrounds. From my perspective, it was a company that has a big need for our solution and also a big cost in creating annotations for their computer vision application.
A: For me, it was a user that gets most utility out of our tool, the one that Hasty solves the problem in the best way.
K: And for me it was the customers that add the most diversity to Hasty’s use cases and data.
T: The benefit is that we’re all approaching problems from that diversity of thought, which is a great value add and allows us to have a more complete picture.
T: We’ll need help with sales at some stage. In this day and age, and although it is changing, people still buy solutions through relationships. There still is some benefit to having senior sales people in your team to help with enterprise sales. Also, we will always be working with what’s happening at the forefront of research. So, we are already, and will continue to, work with researchers that are working on the latest approaches in computer vision.
K: The feature we’re currently building was initially nicknamed the “Killer-Killer Feature”, which we’re probably going to change to “annotation converter”. Here, we take an entire image data set that’s been annotated in a certain fashion (i.e. bounding boxes, instance segmentation, or semantic segmentation) and translate that style of labelling into another. If you had labelled an entire dataset with bounding boxes and wanted to convert that into segmentation, this new feature would enable that. This makes your labelled datasets far more flexible, because they can be applied to different types of problems without having to reannotate.
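One direction of such a conversion is purely geometric: an instance mask already implies its bounding box, while the reverse direction (box to mask) is the hard part that needs a model. A minimal sketch of the easy direction, assuming binary NumPy masks (function name and box format are illustrative):

```python
import numpy as np


def mask_to_bbox(mask: np.ndarray) -> tuple:
    """Convert a binary instance mask (H x W) to an axis-aligned
    bounding box (x_min, y_min, x_max, y_max) in pixel coordinates."""
    ys, xs = np.nonzero(mask)  # row/column indices of mask pixels
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())


# Example: a 5x5 mask containing a 2-row by 3-column object
mask = np.zeros((5, 5), dtype=bool)
mask[1:3, 1:4] = True
print(mask_to_bbox(mask))  # (1, 1, 3, 2)
```

Running the same conversion over every mask in a dataset turns a segmentation dataset into a detection dataset in one pass, which is the flexibility the feature is aiming for.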
We’re also thinking about a feature called “Finish him”. Once the user accepts AI suggestions without significant adjustment, we can show a “Finish him” button that labels the whole project automatically. And thanks to our AI-powered quality tool, you won’t need to check each image manually: it helps identify the weakest, missing, or misclassified labels and suggests extra labels.
T: It’s probably a little bit inspired by names like Slack, that are somewhat ironic, because Slack picks up the slack. Hasty, in one way, speeds up the way you annotate, and reduces the time you waste, so you can speed up projects.
K: And it sounds like tasty.
A: Herbert, the hedgehog. Fran, one of our frontend people working on the project, started drawing hedgehogs to test the tool because they have a lot of edges. And we said, “Let’s make our logo a hedgehog”. And so far, no one has complained.