A Data Science Central Community
Added by Vincent Granville on March 18, 2013 at 9:00am — No Comments
This is the first in a series of blog posts in which I will reveal and explain rules of intelligence contained within grammar that can be used to unleash intelligence in software. These rules are extremely simple, yet still undiscovered by scientists.
Systems that generate questions already exist. However, their questions are useless, because the original sentence, from which the question is derived, already holds the answer.
Added by Menno Mafait on March 9, 2013 at 12:30am — No Comments
EMC is aiding astronomical discoveries by digitizing 200,000+ star plates for research.
Added by Vincent Granville on March 8, 2013 at 3:00pm — No Comments
From Data Science Central, including Big Data News and AnalyticBridge:
Added by Vincent Granville on March 1, 2013 at 9:00am — No Comments
Here we discuss two potential algorithms that can perform clustering extremely fast on big data sets, as well as the graphical representation of such complex clustering structures. By extremely fast, we mean a computational complexity of order O(n), or even faster, such as O(n/log n). This is much faster than good Hierarchical Agglomerative Clustering…
Since 2012, the buzz in the big data world has been around a new position: Data Scientist.
But is this just hype—a new title for what is really a statistician or SQL power-user position?
Or is this a new way to approach big data—one that is critical to your success?
Below, a 57-minute conference talk tries to answer the question:
Editor-in-Chief: Edd Dumbill
ISSN: 2167-6461 • Published Quarterly • Online ISSN: 2167-647X
Current Volume: 1…
Added by Vincent Granville on February 14, 2013 at 12:00pm — No Comments
On the one hand, the advent of Big Data offers a cost-effective prospect of improving decision-making in critical development areas such as health care, employment, economic productivity, crime and security, and natural disaster and resource management. This provides a wealth of opportunities for developing countries. On the other hand, all the well-known caveats of the Big Data debate, such as privacy concerns…
Added by Martin Hilbert on February 9, 2013 at 12:11pm — No Comments
To foster the study of the structure and dynamics of Web traffic networks, we make available a large dataset (‘Click Dataset’) of about 53.5 billion HTTP requests made by users at Indiana University. Gathering anonymized requests directly from the network rather than relying on server logs and browser instrumentation allows one to examine…
Added by Vincent Granville on February 8, 2013 at 11:13am — No Comments
If Big Data is the way of the future, who is driving that future, how are they doing it and what can today's businesses do to be a part of this next wave?
In an effort to collaboratively explore this opportunity, I began a LinkedIn group dedicated to the discussion around Big Data’s future. The group is a growing community of analysts, developers, vendors, journalists and…
Added by Radhika Subramanian on February 8, 2013 at 10:30am — No Comments
The Big Data Analytics revolution is underway. This revolution is an historic and game-changing expansion of the role that information plays in business, government and consumer realms. To harness the power of this data revolution, a paradigm shift is required. Organizations must be able to do more than query their Big Data stores; search is no longer enough. We …
Added by Radhika Subramanian on February 1, 2013 at 11:04am — No Comments
We discuss here a large class of big data problems where MapReduce can't be used - not in a straightforward way at least - and we propose a rather simple analytic, statistical solution.
MapReduce is a technique that splits big data sets into many smaller ones, processes each small data set separately (but simultaneously) on…
Added by Vincent Granville on February 1, 2013 at 10:30am — No Comments
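The split, process, and combine idea described in the MapReduce excerpt above can be illustrated with a minimal sketch. This is not the article's own method or any particular framework's API: the word-count task, the thread pool standing in for separate machines, and the helper names `split_into_chunks`, `map_chunk` and `map_reduce_word_count` are all assumptions made for illustration.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def split_into_chunks(records, n_chunks):
    """Split a list into at most n_chunks contiguous pieces (hypothetical helper)."""
    size = max(1, -(-len(records) // n_chunks))  # ceiling division
    return [records[i:i + size] for i in range(0, len(records), size)]


def map_chunk(chunk):
    """Map step: count the words inside one small data set."""
    counts = Counter()
    for line in chunk:
        counts.update(line.lower().split())
    return counts


def map_reduce_word_count(records, n_chunks=4):
    """Split the data, process each chunk concurrently, then merge the results."""
    chunks = split_into_chunks(records, n_chunks)
    # A thread pool stands in for the separate workers of a real cluster.
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(map_chunk, chunks))
    # Reduce step: add the partial counts together.
    return reduce(lambda a, b: a + b, partials, Counter())


if __name__ == "__main__":
    lines = ["big data", "big clusters", "data science", "big"]
    print(map_reduce_word_count(lines, n_chunks=2))
```

Word counting works here because each chunk can be processed without knowing anything about the others; problems where the pieces depend on one another do not decompose this cleanly.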
Downloadable Aster virtual images provide a free evaluation version of the Aster analytic platform that can be run on a PC.
While this Express edition is not licensed for production use, it is a fully functional Teradata Aster cluster that is an excellent product for developers and testers or anyone who wants a hands-on introduction to our Big Data analytics…
Added by Michel Bruley on February 1, 2013 at 7:07am — No Comments
I’m very pleased to announce a new addition to the DSC community – Big Data News! I’d like to personally invite you to join me on our latest Data Science Central Community Channel dedicated to all things Big Data. You can click on this link to join me and participate in our growing community of Big Data…
Added by Vincent Granville on January 31, 2013 at 10:40am — No Comments
This seminal article highlights the dangers of reckless application and scaling of data science techniques that have worked well for small, medium-size and large data. We illustrate the problem with flaws in big data trading, and propose solutions. Also, we believe expert data scientists are more abundant than (but very different from) what hiring companies claim: read our "related articles" section at the bottom for more details. This article is written in simple…
Added by Vincent Granville on January 31, 2013 at 10:38am — No Comments
This is an interesting editorial from Justin LaFayette which I believe you may find of value.
The current media hype around Big Data has clouded where the real opportunity is: software applications that exploit Big Data.
The infrastructure and platform plays that are grabbing the headlines are critical, but they won’t create the same long-term value as those entrepreneurs who figure out how to apply Big Data to the task of disrupting or accelerating a…
Added by Frank Hawthorn on January 28, 2013 at 4:00pm — No Comments