Big Data News is one of Data Science Central's channels. Below is a selection of popular articles published a while back: Continue
Added by Vincent Granville on June 8, 2017 at 7:00pm — No Comments
Glimpsing the Far Side—How Healthcare Organizations are Applying Predictive Analytics
Guest blog post by Paul Bradley
Anyone who works in healthcare—or who has pondered, even fleetingly, how it differs from other sectors of the American economy—likely won’t be surprised to hear it’s the final frontier for predictive analytics.
For indeed, while predictive analytics has long since reshaped retail, shipping & logistics, and even…Continue
Added by Vincent Granville on August 7, 2015 at 10:30am — No Comments
A nice infographic created by the Technology Services Group. TSG has also produced a blog post to complement the infographic, which you may find useful. It discusses how much technology has shrunk over the years while its power has grown.…Continue
Added by Vincent Granville on May 5, 2015 at 9:26am — No Comments
Originally posted here.
Retailers know they need Big Data and are charging forward to get in the game. But many retailers continue to face challenges. What type of data should be collected? How should the data be used to generate insights? How do I measure ROI?
101data recently surveyed US retailers, across a range of…Continue
Added by Vincent Granville on April 23, 2015 at 9:47am — No Comments
Infographics by Adeptia. You’re probably familiar with the terms byte, megabyte, and gigabyte — but do you know what a terabyte is? How about a petabyte, or an exabyte?…Continue
Added by Vincent Granville on April 15, 2015 at 12:15pm — No Comments
Added by Vincent Granville on March 19, 2015 at 3:00pm — No Comments
Let's say you have to cluster 10 million points, for instance, keywords. You have a dissimilarity function, available as a text file with 100,000,000 entries, each entry consisting of three fields:
Keyword A, Keyword B, distance between A and B, denoted as d(A,B)
So, in short, you can perform k-NN (k-nearest neighbors) clustering or some other type of clustering, which is typically O(n^2) or worse in terms of computational complexity.…
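To make the input format concrete, here is a minimal sketch of one way to cluster directly from such a file without building an n-by-n matrix. It is not necessarily the method the article goes on to describe: it swaps in threshold-based connected components (single-linkage style) using union-find, which runs in roughly linear time in the number of file entries. The function name `cluster_by_threshold`, the comma-separated layout, the sample keywords, and the 0.3 cutoff are all assumptions made for illustration.

```python
import csv
import io
from collections import defaultdict


def cluster_by_threshold(rows, max_distance):
    """Group keywords into connected components.

    rows: iterable of (keyword_a, keyword_b, distance) triples.
    Two keywords end up in the same cluster when a chain of pairs,
    each with distance <= max_distance, links them.
    """
    parent = {}

    def find(x):
        # Register unseen keywords as their own root, then walk up
        # to the root, shortening the path along the way.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, d in rows:
        root_a, root_b = find(a), find(b)
        if float(d) <= max_distance and root_a != root_b:
            parent[root_b] = root_a  # merge the two components

    clusters = defaultdict(set)
    for keyword in list(parent):
        clusters[find(keyword)].add(keyword)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical sample standing in for the 100,000,000-entry file.
sample = "car,auto,0.1\nauto,vehicle,0.2\ncar,banana,0.9\nbanana,fruit,0.15\n"
clusters = cluster_by_threshold(csv.reader(io.StringIO(sample)), 0.3)
print(clusters)
```

For a real file, `csv.reader(open(path, newline=""))` could replace the in-memory sample so that the entries are streamed one at a time; only the keyword-to-parent map needs to stay in memory.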
Added by Vincent Granville on January 27, 2015 at 10:54am — No Comments
Guest blog post by Bernard Marr, first published here.
The field of Big Data requires more clarity and I am a big fan of simple explanations. This is why I have attempted to provide simple explanations for some of the most important technologies and terms you will come across if you’re looking at getting into big…Continue
There have been a few people questioning the value of big data recently, and predicting that big data is going to get smaller in the future. While most of these would-be oracles are traditional statisticians working on small data and worried about their careers, or practitioners in small countries (Canada and France in particular) who do not have access to big data, I was surprised to see Mike Jordan - a famous machine learning professor at Berkeley - …Continue
This article was originally posted on Wikibon. Here I selected a few of the dozens of statistics. Enjoy the reading, and visit the original article: it also features a nice infographic on big data. It would be interesting to add stats about sensor data, or data used in engineering (NASA, etc.). For instance, how many data points are used to make weather forecasts? How many synthetic molecules are simulated each…Continue
Added by Vincent Granville on October 21, 2014 at 3:00pm — No Comments
Defining big data is now a hot topic. Berkeley University posted 40 very short definitions by thought leaders (including me). Here our goal is to offer a very detailed, comprehensive definition that (hopefully) suits everyone.…Continue
Added by Vincent Granville on October 8, 2014 at 8:52am — No Comments
The Big Data Market: 2014 - 2020 - Opportunities, Challenges, Strategies, Industry Verticals and Forecasts. Report published by Signals and Systems Telecom.
Release Date: June 2014
Number of Pages: 289
Number of Figures: 86
Added by Vincent Granville on July 1, 2014 at 9:47am — No Comments
Added by Vincent Granville on June 30, 2014 at 1:30pm — No Comments
Here are the new additions to our webinar series:
1. From Data Silos to Data Lakes - July 8
The evolution from data silos to information infrastructure ecosystems, known as data lakes, will accelerate data-driven insights, application development and time to value. In this webinar…Continue
Added by Vincent Granville on June 18, 2014 at 10:41am — No Comments
Big data is not expensive. You can process 10 terabytes of data per year on collocated servers using open source tools (Python - I do it in Perl), using your own home-made Hadoop system if needed, to score 100 billion transactions, all for less than $1,000 per year. It requires a bit of…Continue
Added by Vincent Granville on February 15, 2014 at 10:42am — No Comments
Go ahead, be skeptical about big data. The author was—at first.
When the term “big data” first came on the scene, bestselling author Tom Davenport (Competing on Analytics, Analytics at Work) thought it was just another example of technology hype. But his research in the years that followed changed his…
Added by Vincent Granville on February 6, 2014 at 10:00am — No Comments
Denying that big data is a new paradigm (post year 2000) is like saying that the human population has been huge for a long time: if we can handle 10 million human beings as we did a few thousand years ago, we can handle 10 billion today the same way, even one trillion. It's the same as saying that data flowing at 10 million rows per day can be processed and analyzed the same way as 10 billion or one trillion per…Continue
Investors have pumped $3.6 billion into startups focused on big data this year. Not too shabby: It’s almost three quarters of all the money that’s gone into such companies from 2008 to 2012, according to a new infographic out today from burgeoning site Big Data…Continue
Added by Vincent Granville on December 18, 2013 at 12:49pm — No Comments
Added by Vincent Granville on November 20, 2013 at 4:30pm — No Comments
This is the full resolution GDELT event dataset running January 1, 1979 through March 31, 2013 and containing all data fields for each event record. The years 1979 through 2005, inclusive, are available as yearly downloads containing all records for each year, while starting in January 2006 data is available as monthly downloads due to the larger number of records per month over time.…Continue