A Data Science Central Community
Here we discuss two potential algorithms that can perform clustering extremely fast, on big data sets, as well as the graphical representation of such complex clustering structures. By extremely fast, we mean a computational complexity of order O(n) and even faster such as O(n/log n). This is much faster than good Hierarchical Agglomerative Clustering…Continue
Since 2012, the buzz in the big data world has been around a new position: Data Scientist.
But is this just hype—a new title for what is really a statistician or SQL power-user position?
Or is this a new way to approach big data—one that is critical to your success?
Below, a 57-minute conference talk tries to answer the question:
Editor-in-Chief: Edd Dumbill
ISSN: 2167-6461 • Published Quarterly • Online ISSN: 2167-647X
Current Volume: 1…
Added by Vincent Granville on February 14, 2013 at 12:00pm — No Comments
On the one hand, the advent of Big Data offers a cost-effective prospect of improving decision-making in critical development areas such as health care, employment, economic productivity, crime and security, and natural disaster and resource management. This provides a wealth of opportunities for developing countries. On the other hand, all the well-known caveats of the Big Data debate, such as privacy concerns…Continue
Added by Martin Hilbert on February 9, 2013 at 12:11pm — No Comments
To foster the study of the structure and dynamics of Web traffic networks, we make available a large dataset (‘Click Dataset’) of about 53.5 billion HTTP requests made by users at Indiana University. Gathering anonymized requests directly from the network rather than relying on server logs and browser instrumentation allows one to examine…Continue
Added by Vincent Granville on February 8, 2013 at 11:13am — No Comments
If Big Data is the way of the future, who is driving that future, how are they doing it and what can today's businesses do to be a part of this next wave?
In an effort to collaboratively explore this opportunity, I began a LinkedIn group dedicated to the discussion around Big Data’s future. The group is a growing community of analysts, developers, vendors, journalists and…Continue
Added by Radhika Subramanian on February 8, 2013 at 10:30am — No Comments
The Big Data Analytics revolution is underway. This revolution is an historic and game-changing expansion of the role that information plays in business, government and consumer realms. To harness the power of this data revolution, a paradigm shift is required. Organizations must be able to do more than query their Big Data stores; search is no longer enough. We …Continue
Added by Radhika Subramanian on February 1, 2013 at 11:04am — No Comments
We discuss here a large class of big data problems where MapReduce can't be used - not in a straightforward way at least - and we propose a rather simple analytic, statistical solution.
MapReduce is a technique that splits big data sets into many smaller ones, processes each small data set separately (but simultaneously) on…Continue
Added by Vincent Granville on February 1, 2013 at 10:30am — No Comments
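The split, process, and combine pattern mentioned in the MapReduce entry above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method from the original post: the word-count task, the function names (`split`, `map_chunk`, `reduce_counts`, `word_count`) and the use of local threads in place of a real cluster are all hypothetical choices made for the example.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def split(records, n_chunks):
    """Split the input into at most n_chunks smaller lists (n_chunks >= 1 assumed)."""
    size = max(1, -(-len(records) // n_chunks))  # ceiling division
    return [records[i:i + size] for i in range(0, len(records), size)]


def map_chunk(chunk):
    """Map step: count words in one chunk, independently of all other chunks."""
    counts = Counter()
    for line in chunk:
        counts.update(line.lower().split())
    return counts


def reduce_counts(left, right):
    """Reduce step: merge two partial results into one."""
    return left + right


def word_count(records, n_chunks=4):
    """Run the map step on every chunk concurrently, then reduce the partial counts."""
    chunks = split(records, n_chunks)
    # Threads stand in for worker machines here; a real MapReduce system
    # would distribute the chunks across a cluster.
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(map_chunk, chunks))
    return reduce(reduce_counts, partials, Counter())


if __name__ == "__main__":
    lines = ["big data", "Big clusters", "data data"]
    print(word_count(lines, n_chunks=2))
```

The pattern works here because each chunk can be counted without looking at any other chunk, and the partial counts can be merged in any order. Problems where the pieces depend on one another do not fit this shape directly.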
Downloadable Aster virtual images provide a free evaluation version of the Aster analytic platform that can be run on a PC.
While this Express edition is not licensed for production use, it is a fully functional Teradata Aster cluster that is an excellent product for developers and testers or anyone who wants a hands-on introduction to our Big Data analytics…
Added by Michel Bruley on February 1, 2013 at 7:07am — No Comments