
Featured Blog Posts (194)

Invitation to join IoT Central

Sign up here to receive (at no cost) our IoT Central weekly digest and full access to our professional network. Alternatively, click here if you are only interested in the newsletter.

The full membership includes, in addition to the newsletter…

Continue

Added by Vincent Granville on November 29, 2017 at 11:13am — No Comments

Big Data: 50 Fascinating and Free Data Sources for Data Visualization

Have you ever felt frustrated when trying to find data on Google? Pages of relevant websites, but none that meets your expectations? Have you ever felt that your articles are less persuasive without supporting data?

General…
Continue

Added by Paul Black on October 30, 2017 at 7:30pm — No Comments

Graph Theory: Six Degrees of Separation Problem

This famous statement, the six degrees of separation, claims that there are at most six degrees of separation between you and anyone else on Earth. Here we feature a simple algorithm that simulates how we are connected, and indeed confirms the claim. We also explain how it applies to web crawlers: any web page is connected to any other web page by a path of at most six links.

The algorithm below is rudimentary and can be used for simulation purposes by any programmer: It does not even…
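The article's own algorithm is not reproduced in this excerpt. As a rough illustration of the idea only, here is a minimal sketch that assumes a purely random acquaintance network and measures separation with a breadth-first search. The function names, the network size `n`, and the number of acquaintances `k` are all hypothetical choices, not taken from the article.

```python
import random
from collections import deque


def random_network(n, k, seed=0):
    """Build an undirected graph: each of n people picks up to k random acquaintances."""
    rng = random.Random(seed)
    neighbors = {i: set() for i in range(n)}
    for person in range(n):
        for other in rng.sample(range(n), k):
            if other != person:  # skip self-links
                neighbors[person].add(other)
                neighbors[other].add(person)
    return neighbors


def degrees_of_separation(neighbors, start):
    """Breadth-first search: fewest hops from start to every reachable person."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        person = queue.popleft()
        for other in neighbors[person]:
            if other not in dist:
                dist[other] = dist[person] + 1
                queue.append(other)
    return dist


# Assumed toy parameters: 2,000 people, 5 random picks each.
network = random_network(n=2000, k=5)
dist = degrees_of_separation(network, start=0)
print("reachable:", len(dist), "max separation:", max(dist.values()))
```

Real social networks are clustered rather than uniformly random, so the numbers this toy model prints should not be read as evidence for or against the claim.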

Continue

Added by Vincent Granville on October 24, 2017 at 11:30pm — No Comments

To Index Data Is to Sort Data

Indexing is commonly used among programmers. Without fully grasping the idea behind the technique, programmers are often eager to reach for it whenever they hit a query performance problem, only to be disappointed by the result on many occasions. By analyzing the principle of indexing, the article tries to show programmers when it is appropriate to use an index and how to use it.

Basic idea

The purpose of indexing is to quickly find…
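To make the title's claim concrete, here is a minimal sketch, assuming an in-memory list of records. The `rows` data and the helper names `build_index` and `lookup` are hypothetical and not taken from the article. The index is simply the keys kept in sorted order alongside each row's position, which is what lets a binary search replace a full scan.

```python
from bisect import bisect_left

# Hypothetical sample table; the rows are stored in no particular order.
rows = [
    {"id": 42, "name": "Ann"},
    {"id": 7, "name": "Bob"},
    {"id": 19, "name": "Cho"},
    {"id": 7, "name": "Dee"},
]


def build_index(rows, field):
    """Sort (key, row position) pairs; the sorted order is the index."""
    return sorted((row[field], pos) for pos, row in enumerate(rows))


def lookup(index, rows, key):
    """Binary-search the sorted keys, then collect every row with that key."""
    i = bisect_left(index, (key, -1))  # first entry whose key is >= key
    found = []
    while i < len(index) and index[i][0] == key:
        found.append(rows[index[i][1]])
        i += 1
    return found


id_index = build_index(rows, "id")
print(lookup(id_index, rows, 7))
```

Building the index costs a sort, so it only pays off when the same field is searched many times. A query on a field other than the indexed one gains nothing from it.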

Continue

Added by JIANG Buxing on August 29, 2017 at 12:30am — No Comments

The Data Computing Layer in Reporting Architecture

By JIANG Buxing

In the previous article, we discussed why a computing layer is necessary in the reporting architecture. Reporting tools support user-defined, interface-based programming in their host language (i.e. the programming language used to develop the reporting tool) to provide the functionality of a computing layer for implementing complex computational logic, but this strategy runs into some real-life problems. An explicit data computing layer…

Continue

Added by JIANG Buxing on August 24, 2017 at 10:30pm — No Comments

Find best hotel for vacation with Sentiment Analysis

Travelling is probably the best way for most people to relax. Choosing the right place to stay for your vacation is one of the most important parts of a trip, but how to do so may be a problem. Reading through the reviews of a hotel may be a good choice: by referring to visitors’ experiences, you get to know more specific details about the hotel. However, this method is not comprehensive enough, and reading a bunch of reviews can be irritating. Here is a…

Continue

Added by Zhouyiming on August 28, 2017 at 12:00am — No Comments

The Largest Number Ever Created

There are numbers that are so large that there is no compact formula to represent them. Think of a number so large, that its number of digits is so large, that the number of digits of its number of digits is so large... and it goes on and on -- you get the idea.

Sure, if you are able to define such a number, then add one, or even 0.5, and you get an even bigger number. But this is not the point. The issue is to come up with such massive numbers in the first place. The biggest…

Continue

Added by Vincent Granville on August 16, 2017 at 1:00pm — No Comments

How Marketers Use Data Science to Increase Reach

An interesting infographic produced by Villanova University.

Continue

Added by Vincent Granville on August 21, 2017 at 9:34am — No Comments

How data analytics is transforming the Health care industry

An estimated 50 petabytes of data exist in the health care realm, predicted to grow to 25,000 petabytes by 2020, according to a new infographic from Oracle. This astonishing report shows that the health care industry is generating a huge amount of data, driven by clinical records, medical care, and compliance & regulatory requirements.

Luckily, big data analytics applications have been widely used in…

Continue

Added by Paul Black on August 10, 2017 at 7:00pm — No Comments

How Marketers Use Data Analytics to Reach New and Existing Customers (Infographic)

Big data and analytics can help a business predict consumer behavior, improve decision-making across the board and determine the ROI of its marketing efforts. By addressing these aspects adequately, the business can not only protect its market share but also expand into new territories. The infographic below, by Villanova University School of Business Online, takes a detailed look at this…

Continue

Added by Jay Taylor on September 1, 2017 at 8:00pm — 1 Comment

Can Science Create a System to Win at Roulette?

Probability and physics are helping make even roulette seem ultimately predictable.

In his new book, The Perfect Bet: How Science and Math Are Taking the Luck Out of Gambling, Adam Kucharski details how trying to understand dice games led one mathematician to develop probability theory,…

Continue

Added by Edward Turner on July 19, 2016 at 4:30pm — No Comments

12 Great Articles from Big Data News

Big Data News is one of Data Science Central's channels. Below is a selection of popular articles published a while back:

Continue

Added by Vincent Granville on June 8, 2017 at 7:00pm — No Comments

Website Crawler & Sentiment Analysis

To start with sentiment analysis, the first question that comes to mind is where and how we can crawl oceans of data for our analysis. Normally, a web crawler, or crawling social media sites, is a reasonable way to get access to public opinion data. Thus, in this post, I want to share how I crawled a website using a web crawler and went on to deal with the data for…

Continue

Added by Paul Black on February 28, 2017 at 10:00pm — No Comments

Advanced Analytics Give CFOs More Clout

Advanced analytics continues to permeate more functional areas of the enterprise. From marketing campaigns and sales optimization to supply chain and human capital management, business users are deploying newer, easier-to-use…

Continue

Added by Gabriel Lowy on April 11, 2017 at 8:00am — No Comments

Is Hadoop Failing?

'Hadoop Is Failing' Article

A recent LinkedIn post linking to an Innovation Enterprise article entitled 'Hadoop Is Failing' certainly got our attention, as you might expect.

Apart from disagreeing with the assertion that 'Hadoop...is very much the foundation on which data today is built', the main thrust of the article…

Continue

Added by Richard Jackson on April 14, 2017 at 12:30am — No Comments

Web Scraping Service & OVR Classification based on Twitter in Machine Learning

Many social media platforms, such as Twitter and Facebook, are evolving into sources of information from which people scrape various kinds of data, since microblogs on which users post real-time messages show millions of opinions about their attitudes or sentiment towards hot topics and current issues. Recently, I decided to learn how regional sentiment analysis can help people make specific decisions or policy…

Continue

Added by Paul Black on March 20, 2017 at 2:30am — No Comments

The Best Answers to Your Most Crucial Deep Learning Questions

(picture from www.re-work.co)

Most people keep a close eye on fast-moving technology trends. There’s no doubt that deep learning is one of the most trending buzzwords today. Deep learning has made significant breakthroughs and is applied in many areas, such as facial recognition, image recognition and AlphaGo. Thus…

Continue

Added by Paul Black on December 14, 2016 at 11:30pm — No Comments

Topic Modeling in R

As a part of Twitter data analysis, so far I have completed Movie Review using R and Document Classification using R. Today we will be discovering topics in tweets, i.e. mining the tweet data to discover underlying topics, an approach known as topic modeling.

What is Topic…
Continue

Added by suresh kumar gorakala on December 23, 2015 at 8:30pm — No Comments

Fast clustering algorithms for massive datasets

Here we discuss two potential algorithms that can perform clustering extremely fast, on big data sets, as well as the graphical representation of such complex clustering structures. By extremely fast, we mean a computational complexity of order O(n) and even faster such as O(n/log n). This is much faster than good Hierarchical Agglomerative Clustering…

Continue

Added by Vincent Granville on February 23, 2013 at 10:00pm — 4 Comments

Big data set - 3.5 billion web pages - made available for all of us

This page provides a large hyperlink graph for public download. The graph has been extracted from the Common Crawl 2012 web corpus and covers 3.5 billion web pages and 128 billion hyperlinks between these pages. To the best of our knowledge, the graph is the largest hyperlink graph that is available to the public outside companies…
Continue

Added by Vincent Granville on November 18, 2013 at 10:30am — No Comments

© 2017 BigDataNews.com is a subsidiary of DataScienceCentral LLC and not affiliated with Systap
