0xdata (www.0xdata.com), the open source machine learning and predictive analytics company for big data, today announced general availability of the latest release of H2O, the industry's fastest prediction engine for big data users of Hadoop, R and Excel. H2O delivers parallel and distributed advanced algorithms on big data at speeds up to 100X faster than those offered by other predictive analytics providers.
The second generation H2O "Fluid Vector" release -- currently in use at two of the largest insurance companies in the world, the largest provider of streaming video entertainment and the largest online real estate services company -- delivers new levels of performance, ease of use and integration with R. Early H2O customers include Netflix, Trulia and Vendavo.
"We developed H2O to unlock the predictive power of big data through better algorithms," said SriSatish Ambati, CEO and co-founder of 0xdata. "H2O is simple, extensible and easy to use and deploy from R, Excel and Hadoop. The big data science world is one of algorithm-haves and have-nots. Amazon, Goldman Sachs, Google and Netflix have proven the power of algorithms on data. With our viral and open Apache software license philosophy, along with close ties into the math, Hadoop and R communities, we bring the power of Google-scale machine learning and modeling without sampling to the rest of the world."
"Big data by itself is useless. It is only when you have big data plus big analytics that one has the capability to achieve big business impact. H2O is the platform for big analytics that we have found gives us the biggest advantage compared with other alternatives," said Chris Pouliot, Director of Algorithms and Analytics at Netflix and advisor to 0xdata. "Our data scientists can build sophisticated models, minimizing their worries about data shape and size on commodity machines. Over the past year, we partnered with the talented 0xdata team to work with them on building a great product that will meet and exceed our algorithm needs in the cloud."
"There are many machine learning packages out there, but H2O gives us advanced algorithms that scale on big data," said Todd Holloway, Data Science Lead at Trulia, the online real estate market. "We love that H2O is open source and were impressed by the talent of the 0xdata team. The project has grown rapidly and quite a few distributed algorithms have been built in a short span."
With H2O, users can easily explore and model big data from within Microsoft Excel and RStudio and connect it with data from HDFS, S3, SQL and NoSQL data sources. H2O is easy to install and deploy anywhere: on a desktop, on Amazon EC2 or in place on big Hadoop clusters. With a simple click, these data models can be turned into scoring engines ready for low-latency production environments. Rather than making users wait for an entire job to finish, H2O provides approximate results at every step in the analysis process, so users can get a general idea of results, then kill a job and start over quickly if the early approximate numbers fall outside an anticipated range.
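The idea of watching interim estimates and stopping a job early can be illustrated with a minimal Python sketch. This is not H2O code or its API; the function names, the running-mean "job" and the `warmup` parameter are all assumptions made for illustration only.

```python
def running_mean(values):
    """Yield (rows_seen, approximate_mean) after every row (hypothetical job)."""
    total = 0.0
    for n, v in enumerate(values, start=1):
        total += v
        yield n, total / n


def run_with_early_stop(values, low, high, warmup=3):
    """Consume interim estimates and stop once one leaves [low, high].

    `warmup` (an assumed parameter) is the number of rows to see before
    the interim estimate is trusted enough to kill the job.
    """
    n, estimate = 0, None
    for n, estimate in running_mean(values):
        if n >= warmup and not (low <= estimate <= high):
            return ("killed", n, estimate)
    return ("finished", n, estimate)


if __name__ == "__main__":
    # The interim mean drifts out of the expected range at the fourth row.
    print(run_with_early_stop([1, 2, 3, 100, 100], low=0, high=10))
```

In this toy run the job is stopped at the fourth row instead of running to completion, which is the workflow the paragraph above describes.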
H2O's in-memory columnar compression and fine-grain parallelism via MapReduce provide unmatched speed, scale and extensibility for advanced algorithms on big data. Customers can extend the Lego-like architecture and run their own algorithms and models, or take advantage of 0xdata's latest distributed algorithms, including Gradient Boosting Machine (GBM), Random Forest (RF), Generalized Linear Modeling (GLM), k-Means and Principal Component Analysis (PCA). The speed is striking: H2O's GLM ran Logistic Regression on a dataset with 150 million rows and 750 categorical columns in less than five seconds on commodity hardware.
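As a rough illustration of fine-grain map/reduce parallelism over a column, the Python sketch below splits one column into small chunks, maps each chunk to a partial result in parallel and reduces the partials into a single answer. It is a generic sketch under stated assumptions, not H2O's implementation: the function names, the chunk size and the use of a thread pool are all hypothetical, and compression is not modeled.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def split_into_chunks(column, chunk_size):
    """Cut a column (a list of numbers) into fixed-size chunks."""
    return [column[i:i + chunk_size] for i in range(0, len(column), chunk_size)]


def map_chunk(chunk):
    """Map step: partial (sum, count) for one chunk."""
    return (sum(chunk), len(chunk))


def reduce_partials(a, b):
    """Reduce step: combine two partial (sum, count) pairs."""
    return (a[0] + b[0], a[1] + b[1])


def chunked_mean(column, chunk_size=4, workers=4):
    """Mean of a column computed by mapping over chunks and reducing."""
    if not column:
        raise ValueError("column must not be empty")
    chunks = split_into_chunks(column, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(map_chunk, chunks))
    total, count = reduce(reduce_partials, partials, (0.0, 0))
    return total / count


if __name__ == "__main__":
    print(chunked_mean(list(range(10))))
```

Because each chunk is processed independently and the reduce step is associative, the same pattern extends from threads on one machine to many machines, which is the property the paragraph above relies on.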
A vibrant community of mathematicians and analysts has built up quickly around a shared interest in H2O. 0xdata has sponsored or participated in more than two dozen meet-ups in the Bay Area alone since April 2013. For more information on upcoming H2O meet-ups, please visit http://0xdata.com/events/ or join the movement at https://github.com/0xdata/h2o.
0xdata develops open source H2O, the world's fastest in-memory platform for machine learning and predictive analytics on big data. By running advanced algorithms such as GBM, GLM, PCA and RF, among others, users can get to interim and final results quickly and make better data-driven decisions faster. 0xdata is based in Silicon Valley and is backed by Nexus Venture Partners along with leading angel investors in big data.