Featured Blog Posts – February 2013 Archive (10)

Fast clustering algorithms for massive datasets

Here we discuss two potential algorithms that can cluster big data sets extremely fast, as well as the graphical representation of such complex clustering structures. By extremely fast, we mean a computational complexity of order O(n), or even lower, such as O(n/log n). This is much faster than good Hierarchical Agglomerative Clustering…

Added by Vincent Granville on February 23, 2013 at 10:00pm — 4 Comments

[Book] Mining of Massive Data Sets

Publication Date: December 30, 2011 | ISBN-10: 1107015359 | ISBN-13: 978-1107015357…

Added by Vincent Granville on February 20, 2013 at 5:30pm — 1 Comment

Do You Really Need a Data Scientist?

Since 2012, the buzz in the big data world has centered on a new position: Data Scientist.

But is this just hype, a new title for what is really a statistician or SQL power-user position?

Or is this a new way to approach big data, one that is critical to your success?

The 57-minute conference talk below tries to answer that question:

http://bcove.me/0ahe6mmb

Added by Michel Bruley on February 15, 2013 at 12:57am — 1 Comment

Big Data Journal Inauguration

Editor-in-Chief: Edd Dumbill

ISSN: 2167-6461 • Published Quarterly • Online ISSN: 2167-647X

Current Volume: 1…

Added by Vincent Granville on February 14, 2013 at 12:00pm — No Comments

Is Big Data an opportunity or a threat for developing countries?

On the one hand, the advent of Big Data offers a cost-effective prospect of improving decision-making in critical development areas such as health care, employment, economic productivity, crime and security, and natural disaster and resource management. This provides a wealth of opportunities for developing countries. On the other hand, all the well-known caveats of the Big Data debate, such as privacy concerns…

Added by Martin Hilbert on February 9, 2013 at 12:11pm — No Comments

53.5 billion clicks dataset available for benchmarking and testing

To foster the study of the structure and dynamics of Web traffic networks, we make available a large dataset (‘Click Dataset’) of about 53.5 billion HTTP requests made by users at Indiana University. Gathering anonymized requests directly from the network rather than relying on server logs and browser instrumentation allows one to examine…

Added by Vincent Granville on February 8, 2013 at 11:13am — No Comments

What’s next for Big Data?

If Big Data is the way of the future, who is driving that future, how are they doing it and what can today's businesses do to be a part of this next wave?

In an effort to collaboratively explore this opportunity, I began a LinkedIn group dedicated to the discussion around Big Data’s future. The group is a growing community of analysts, developers, vendors, journalists and…

Added by Radhika Subramanian on February 8, 2013 at 10:30am — No Comments

Four Takeaways From the Big Data Paradigm Shift White Paper

The Big Data Analytics revolution is underway. This revolution is an historic and game-changing expansion of the role that information plays in business, government and consumer realms. To harness the power of this data revolution, a paradigm shift is required. Organizations must be able to do more than query their Big Data stores; search is no longer enough. We …

Added by Radhika Subramanian on February 1, 2013 at 11:04am — No Comments

What MapReduce can't do

We discuss here a large class of big data problems where MapReduce can't be used - not in a straightforward way at least - and we propose a rather simple analytic, statistical solution.

MapReduce is a technique that splits big data sets into many smaller ones, processes each small data set separately (but simultaneously) on…

Added by Vincent Granville on February 1, 2013 at 10:30am — No Comments

MapReduce for VMware Player images

Downloadable Aster virtual images provide a free evaluation version of the Aster analytic platform that can be run on a PC.

While this Express edition is not licensed for production use, it is a fully functional Teradata Aster cluster that is an excellent product for developers and testers or anyone who wants a hands-on introduction to our Big Data analytics…

Added by Michel Bruley on February 1, 2013 at 7:07am — No Comments

© 2019 BigDataNews.com is a subsidiary of DataScienceCentral LLC and not affiliated with Systap
