November 30, 2015
IBM has fulfilled its promise to open-source SystemML, a machine learning system that has now been accepted as an Apache Incubator project. It’s a significant milestone for SystemML, which already powers IBM’s Big Data analytics platform. The Apache Incubator program is a stepping stone on the way to becoming a full project under The Apache Software Foundation (ASF). During incubation, developers ensure that code donations adhere to the ASF’s guidelines and that the community follows its principles.
The SystemML technology emerged from IBM’s development of Watson, and integrates closely with another Apache project, Spark. SystemML helps Watson keep up to date by providing a language that directly exposes the capabilities of the artificial intelligence so that data scientists can make use of them. Queries are written in a syntax modeled on the popular R statistical programming language, and are then executed in whichever mode is most efficient for the specific workload and the operational characteristics of the Spark cluster.
IBM said back in June this year that it would donate SystemML to The Apache Software Foundation, and the project has hit a number of milestones since then, including more than 320 patches covering APIs, data ingestion, optimizations and additional algorithms. IBM’s engineers have also made more than 90 contributions to the Apache Spark project, aimed at making machine learning compatible with Spark. IBM’s move to open-source SystemML continues a trend set by other tech giants, including Google, which recently open-sourced its TensorFlow machine learning software, and Facebook, which donated artificial intelligence and machine learning tools to the existing Torch open-source project.
This is all great news for data-driven enterprises, which now have an array of free, open-source machine learning tools to choose from. Whereas Google’s TensorFlow and Facebook’s Torch are designed to train neural networks, SystemML broadens the ecosystem with a tool that every type of business can use.
Of course, the tech giants benefit just as much, if not more. Open-sourcing their machine learning tools gives them access to much more data, and that data is what helps these technologies evolve and become even more powerful. For IBM, there is an added bonus: if SystemML can scale, the platform could well provide a gateway for customers to try out the rest of its data analytics tools.