When starting a sentiment analysis project, the first question is where and how to collect large volumes of data. Crawling websites or social media is a reasonable way to get access to public opinion data. In this post, I want to share how I crawled a website using a web crawler and then processed the data for…Continue
Added by Paul Black on February 28, 2017 at 10:00pm — No Comments
Advanced analytics continues to permeate more functional areas of the enterprise. From marketing campaigns and sales optimization to supply chain and human capital management, business users are deploying newer, easier to use…Continue
Added by Gabriel Lowy on April 11, 2017 at 8:00am — No Comments
A recent LinkedIn post linking to an Innovation Enterprise article entitled 'Hadoop Is Failing' certainly got our attention, as you might expect.
Apart from disagreeing with the assertion that 'Hadoop...is very much the foundation on which data today is built' the main thrust of the article…Continue
Added by Richard Jackson on April 14, 2017 at 12:30am — No Comments
Many social media platforms, such as Twitter and Facebook, are becoming sources from which people scrape varied kinds of data, since microblogs on which users post real-time messages show millions of opinions and sentiments toward hot topics and current issues. Recently, I decided to learn how regional sentiment analysis can help people make specific decisions or policy…Continue
Added by Paul Black on March 20, 2017 at 2:30am — No Comments
(picture from www.re-work.co)
Most people keep a close eye on fast-moving technology trends. There’s no doubt that deep learning is one of the most popular buzzwords today. Deep learning has made significant breakthroughs and is applied in many areas, such as facial recognition, image recognition and game playing (AlphaGo). Thus…Continue
Added by Paul Black on December 14, 2016 at 11:30pm — No Comments
As part of Twitter Data Analysis, I have so far covered Movie Review using R and Document Classification using R. Today we will discover topics in tweets, i.e. mine the tweet data to uncover underlying topics, an approach known as Topic Modeling.
Added by suresh kumar gorakala on December 23, 2015 at 8:30pm — No Comments
Here we discuss two potential algorithms that can perform clustering extremely fast, on big data sets, as well as the graphical representation of such complex clustering structures. By extremely fast, we mean a computational complexity of order O(n) and even faster such as O(n/log n). This is much faster than good Hierarchical Agglomerative Clustering…Continue
Added by Vincent Granville on November 18, 2013 at 10:30am — No Comments
This is the full resolution GDELT event dataset running January 1, 1979 through March 31, 2013 and containing all data fields for each event record. The years 1979 through 2005, inclusive, are available as yearly downloads containing all records for each year, while starting in January 2006 data is available as monthly downloads due to the larger number of records per month over time.…Continue
You probably have many questions about the collection of Big Data: how organizations gather Big Data, and how to gather information for quantitative research. Don't stress: if you came here looking for answers to these questions, you are on the right page, as this article gives a complete yet quick overview of Big Data collection strategies. …Continue
Added by Ayushi Mishra on November 4, 2016 at 3:00am — No Comments
Web scraping (also termed web data extraction, screen scraping, or web harvesting) is a technique for extracting data from the web and turning unstructured web data into structured data that can be stored on your local computer or in a database.
Web scraping is implemented by web scraping software tools. These tools interact with websites in the same way as you do when using a…Continue
Added by Paul Black on September 22, 2016 at 11:00pm — No Comments
The modern world seems fast and dynamic, with a multitude of new products being launched. Marketing agencies are making a fortune by monitoring the markets and delivering reports on consumers’ opinions. Today, feedback analysis is a separate area, arguably a growing industry with an array of products and services. And the prices for those services are pretty exorbitant.
So, do vendors have a chance to cut down…Continue
Added by Yana Yelina on August 12, 2016 at 12:00am — No Comments
Big Data is an accumulation of data that is too large and complex for processing by traditional database management tools.
Yeah But, What Really Makes Big Data Big Data? This question is as fundamental to data science as the chicken/egg question should be to researchers at KFC. But we’re not dealing with an A/B chicken model here. It’s more elephant to the dark room or scaling it up, the nearest star to our galactic…Continue
Added by Orion Stallard on July 8, 2016 at 12:54pm — No Comments
I want to share an interesting article about data scraping that you might need in your business. The article below is mainly reprinted from here.
Text in an HTML document is the content placed between HTML tags such as <a> </a> and <title> </title>. Sometimes we want to extract the text from the HTML document, and there are two methods that can…Continue
Added by Nora Choi on May 31, 2016 at 2:30am — No Comments
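As an illustration of the idea in the excerpt above (pulling out the text that sits between tags such as <a> and <title>), here is a minimal sketch using Python's standard-library html.parser. The excerpt is cut off before it names its two methods, so this is not necessarily either of them; the class name TagTextExtractor and the function extract_tag_text are hypothetical names chosen for this example.

```python
from html.parser import HTMLParser


class TagTextExtractor(HTMLParser):
    """Collect the text found between selected tags (hypothetical helper)."""

    def __init__(self, wanted_tags):
        super().__init__()
        self.wanted_tags = set(wanted_tags)
        self._stack = []    # (tag, text parts) for wanted tags currently open
        self.results = []   # (tag, text) pairs, appended when a tag closes

    def handle_starttag(self, tag, attrs):
        if tag in self.wanted_tags:
            self._stack.append((tag, []))

    def handle_data(self, data):
        # Text is credited to the innermost open wanted tag only.
        if self._stack:
            self._stack[-1][1].append(data)

    def handle_endtag(self, tag):
        if self._stack and self._stack[-1][0] == tag:
            name, parts = self._stack.pop()
            self.results.append((name, "".join(parts).strip()))


def extract_tag_text(html, wanted_tags=("a", "title")):
    """Return (tag, text) pairs for every wanted tag in the HTML string."""
    parser = TagTextExtractor(wanted_tags)
    parser.feed(html)
    parser.close()
    return parser.results
```

Calling extract_tag_text on a page's HTML string returns one pair per matching tag, so the unstructured markup becomes a small structured list that can be saved to a file or database. For messy real-world pages a dedicated scraping library may be more forgiving than this sketch.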
YARN ResourceManager (the YARN service master component)
1) Controls the total resource capacity of the cluster
2) Whenever a container is needed in the cluster, it enforces the minimum container size, which is controlled by the YARN configuration property
-> yarn.scheduler.minimum-allocation-mb 1024 (this value changes based on cluster RAM capacity)
Description: The minimum allocation for every container request at the RM, in MBs.…Continue
Added by skumar T on May 30, 2016 at 8:00pm — No Comments
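To make the effect of yarn.scheduler.minimum-allocation-mb from the excerpt above concrete, here is a rough arithmetic sketch, not YARN's actual scheduler code. It assumes only what the property description states: a container request below the minimum is raised to the minimum. The function names and the 8192 MB cluster size are hypothetical, and real schedulers may additionally round requests up to an increment or cap them at a maximum allocation, which is not modeled here.

```python
def effective_container_mb(requested_mb, minimum_allocation_mb=1024):
    """Memory a container would get once the minimum allocation is applied.

    Assumption: requests below the minimum are raised to the minimum;
    larger requests are passed through unchanged in this sketch.
    """
    if requested_mb <= 0:
        raise ValueError("requested_mb must be positive")
    return max(requested_mb, minimum_allocation_mb)


def max_containers(cluster_mb, minimum_allocation_mb=1024):
    """Upper bound on concurrent containers if each uses the minimum size."""
    return cluster_mb // minimum_allocation_mb


# Hypothetical cluster with 8192 MB available to YARN.
print(effective_container_mb(512))   # a small request is raised to the minimum
print(max_containers(8192))          # containers that fit at the minimum size
```

This is why the value is usually tuned to the cluster's RAM: a larger minimum wastes memory on small tasks, while a smaller one allows more containers to be packed onto each node.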
What does The Library of Alexandria, The Normans and a book have to do with data? I never thought about
...at Alexandria was in charge of collecting all the world's knowledge, and most of the staff was occupied with the task of translating works onto papyrus paper... 1
Or The Normans and the...
Domesday Book (Latin: Liber de Wintonia "Book of…
Added by George Psistakis on May 20, 2016 at 5:20am — No Comments
As a central repository and processing engine, data lakes hold great promise for raising return on data assets (RDA). Bringing analytics directly to different data in its native formats can accelerate time-to-value by…Continue
Added by Gabriel Lowy on April 11, 2016 at 12:00pm — No Comments
As we evolve toward a software-defined world, there’s a new user experience urgency emerging. That’s because the definition of “user” is going to be vastly expanded. In the Internet of Things (IoT)…Continue
Added by Gabriel Lowy on March 30, 2016 at 9:43am — No Comments
Is your company poised to take advantage of three key trends in Big Data? Syncsort, a global leader in Big Data and mainframe software, recently released the results of its second annual Hadoop survey. Based on the survey results there are three areas that companies will focus on in 2016, to realize the full potential of Big Data analytics.
First, Apache Spark will move from a talking point into deployment. Nearly 70 percent of survey respondents are interested in Apache…Continue
Added by John McCure on January 22, 2016 at 4:00pm — No Comments
Curse of Dimensionality:
One of the most common problems in data analytics tasks such as recommendation engines and text analytics is high-dimensional and sparse data. We often face situations where we have a large set of features and few data points, or data with very high-dimensional feature vectors. In such scenarios, fitting a model to the dataset results in lower predictive power. This scenario is often termed as…Continue
Added by suresh kumar gorakala on February 28, 2016 at 9:30pm — No Comments