Our Venture Developers (VD) often need to find specific data on ventures in order to run market or competitor analyses or to find potential investors for our ventures. Usually, they consult Tracxn, a website that tracks startups across 230+ sectors. The data Tracxn provides is qualitative and informative; however, our venture developers struggle with the large amount of data displayed on its unintuitive dashboard. This pain point inspired our UX researcher Alex, in collaboration with our Software Engineer Mikhail, to simplify and accelerate the whole process of finding relevant information for the VD team.
Their process was first to collect data on all funded startups in Berlin over recent years (approx. 2,400 in total) and then to put it on GitHub. From there, the project was divided into two distinct parts: data scraping and data analysis.
Data Scraping
As pro subscribers, we get access to the Tracxn APIs and to data on all companies based in Berlin. Mikhail focused on Berlin because Tracxn allows only a limited number of API calls each month.
The next step was to write a simple scraper that queried Berlin companies, combined the data, and saved it as a pickled Python object.
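To give a rough idea of what such a scraper can look like, here is a minimal sketch. It is not our actual code and it does not call the real Tracxn API: `fake_fetch_page` is a hypothetical stand-in for one API request, and the names `scrape_companies`, `save_pickle`, `load_pickle` and the `max_calls` budget are assumptions made for illustration. The sketch only shows the three steps described above: query page by page, combine the results, and pickle them.

```python
import pickle
import tempfile
from pathlib import Path
from typing import Callable, Dict, List

Company = Dict[str, object]


def scrape_companies(
    fetch_page: Callable[[int], List[Company]], max_calls: int
) -> List[Company]:
    """Fetch pages until one comes back empty or the call budget is used up."""
    companies: List[Company] = []
    for page in range(max_calls):
        batch = fetch_page(page)  # one (hypothetical) API call per page
        if not batch:
            break
        companies.extend(batch)  # combine all pages into one list
    return companies


def save_pickle(companies: List[Company], path: Path) -> None:
    """Serialize the combined list to disk as a pickled Python object."""
    with path.open("wb") as fh:
        pickle.dump(companies, fh)


def load_pickle(path: Path) -> List[Company]:
    """Read the pickled list back; only do this with files you trust."""
    with path.open("rb") as fh:
        return pickle.load(fh)


def fake_fetch_page(page: int) -> List[Company]:
    """Hypothetical stand-in for an API request, returning made-up records."""
    pages = [
        [{"name": "Alpha", "city": "Berlin"}, {"name": "Beta", "city": "Berlin"}],
        [{"name": "Gamma", "city": "Berlin"}],
    ]
    return pages[page] if page < len(pages) else []


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "berlin_companies.pkl"
        save_pickle(scrape_companies(fake_fetch_page, max_calls=10), target)
        print(len(load_pickle(target)), "companies saved")
```

Passing the fetch function in as a parameter keeps the monthly call limit explicit through `max_calls` and lets the scraper be tried out without spending any real API calls.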
Data Analysis
At this point in the project, Rafael, one of our Data Scientists, joined the team. Rafael used Pandas to analyze the scraped data and Matplotlib to plot simple graphs, such as the following example: