Organizations depend on big data frameworks to analyze data, build data pipelines across siloed storage systems, and prepare labeled datasets for model building. This helps them train machine learning and other artificial intelligence models using existing data. Databricks’ Unified Analytics Platform is among the foremost platforms for this work.
Databricks has its roots at the University of California, Berkeley. It was there that the founders came up with the idea for the company while working on the university’s AMPLab project, before Databricks was established in 2013.
Today, the company has thousands of customers, including big names such as Shell, HP, Regeneron, Expedia, and Comcast. The company also has an extensive global network of partners that includes Amazon, Tableau, Microsoft, Booz Allen Hamilton, and Capgemini.
Harness the power of data
The world is becoming increasingly data-oriented, and data is now the basis of how organizations function. But that data needs to be harnessed the right way. Databricks helps organizations get their data ready for analytics and data science so that they can make data-driven decisions. The company empowers them to adopt machine learning and gain an edge over their competition.
The San Francisco-based company has been one of the foremost forces in driving the use of data in recent years. It has seen massive growth this year, adding hundreds of new employees across various regions. It has also expanded rapidly from its home country into the Asia Pacific and European regions. Recently, the company committed $100 million to a European Development Center in Amsterdam, which will oversee its expansion in Europe.
Raises $400 million in Series F
In October this year, Databricks announced that it had closed a mega-funding round. In its Series F round, the company secured $400 million. The round was led by Andreessen Horowitz and saw the participation of Microsoft, BlackRock, Coatue Management, Green Bay Ventures, Alkeon Capital Management, Dragoneer Investment Group, Geodesic, T. Rowe Price, and Tiger Global Management.
The fresh funding propelled the company’s valuation to $6.2 billion. The valuation and funding show that investors have great faith in the direction Databricks is taking. Ben Horowitz, co-founder and general partner of Andreessen Horowitz, says, “No other company has successfully commercialized open source software like Databricks.”
The company’s work was widely acknowledged this year, and it recently made the 2019 LinkedIn Top Startups list.
Ali Ghodsi, CEO and Co-Founder
Ali Ghodsi took on the role of CEO in January 2016. He previously served as the VP of Engineering and Product Management. He is also an adjunct professor at UC Berkeley and is affiliated with UC Berkeley’s RISELab.
His academic research in resource management, scheduling, and data caching has been applied to Apache Mesos and Apache Hadoop. Ali was one of the creators of the open source project Apache Spark.
He earned his PhD in Computer Science from KTH Royal Institute of Technology and also holds an MBA from Mid-Sweden University.
The Unified Analytics Platform helps Regeneron
Databricks works with several pharmaceutical and healthcare companies to help them in the drug discovery process. One of its customers is Regeneron, a biotechnology company that has sequenced genetic data from several hundred thousand volunteers and paired the de-identified genetic data with de-identified electronic health records for drug discovery. But the biotechnology company faced several challenges, such as:
-Decentralized genomic and clinical data made it hard to analyze and train models against their entire 10TB dataset
-It was difficult and expensive to scale the legacy architecture to support analytics on over 80 billion data points
-ETL took days or even weeks
Databricks stepped in to help. Its Unified Analytics Platform simplified operations and accelerated the drug discovery process. The results:
-Queries on the entire dataset that used to take data scientists and computational biologists 30 minutes now run in 3 seconds
-Automated DevOps, accelerated pipelines and improved collaboration
“Databricks’ unified platform has helped foster collaboration across our data science and engineering teams which has impacted innovation and productivity.” – John Landry, Distinguished Technologist at HP, Inc
“We were able to take a tool that previously would have been fairly localized to a single region and turn that into a global product which actually is now becoming the foundation for the way our inventory analysts will now do their work.” – Daniel Jeavons, General Manager Advanced Analytics CoE, Shell
“Databricks lets us focus on business problems and makes certain processes very simple. Now it’s a question of how we bring these benefits to others in the organization who may not be aware of what they can do with this type of platform.” – Dan Morris, Senior Director Product Analytics, Viacom
“Leverage your entire data lake, including streaming data, for the most complete BI reporting and visualizations.”
“Simple data processing on auto-scaling infrastructure. Powered by highly optimized Apache Spark™ for up to 50x performance gains.”