A Data Science Central Community

**Curse of Dimensionality**:

One of the most common problems in data analytics tasks such as recommendation engines and text analytics is high-dimensional, sparse data. We often face situations where we have a large set of features and few data points, or data with very high-dimensional feature vectors. In such scenarios, fitting a model to the dataset results in a model with lower predictive power. This scenario is often termed…
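One facet of this problem can be seen with a few lines of code. The sketch below is only an illustration in Python (not taken from the original post, which is truncated here): it draws random points and measures how much the nearest and farthest neighbours of a query point differ. The function name `distance_contrast` and the point counts are assumptions made for the example. As the number of dimensions grows, the contrast tends to shrink, which is one reason distance-based models struggle on high-dimensional data.

```python
import math
import random


def distance_contrast(n_points, n_dims, seed=0):
    """Return (max - min) / min over distances from one query point to the rest.

    Points are drawn uniformly from the unit hypercube. A small value means
    the nearest and farthest neighbours are almost equally far away.
    """
    rng = random.Random(seed)
    points = [[rng.random() for _ in range(n_dims)] for _ in range(n_points)]
    query, others = points[0], points[1:]
    dists = [math.dist(query, p) for p in others]
    return (max(dists) - min(dists)) / min(dists)


# Same number of points, increasing number of features.
for dims in (2, 10, 100, 1000):
    print(dims, round(distance_contrast(200, dims), 3))
```

Running it for a fixed number of points and an increasing number of features shows the effect directly; the exact numbers depend on the random seed.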

Added by suresh kumar gorakala on February 28, 2016 at 9:30pm

Today I will explain how to create a basic movie review engine in R, based on people's tweets. The review engine is implemented in the following steps:

- Get tweets from Twitter
- Clean the data
- Create a word cloud
- Create a data dictionary
- Score each tweet
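The cleaning, dictionary and scoring steps listed above can be sketched roughly as follows. This is a minimal illustration in Python rather than the R code of the original post, and it skips the Twitter API call entirely. The word lists `POSITIVE` and `NEGATIVE`, the helper names and the sample tweets are all hypothetical, chosen only to make the example self-contained.

```python
import re
from collections import Counter

# Hypothetical mini-dictionaries, for illustration only.
POSITIVE = {"good", "great", "awesome", "love", "excellent"}
NEGATIVE = {"bad", "boring", "awful", "hate", "terrible"}


def clean_tweet(text):
    """Lowercase a tweet, strip URLs, @mentions and non-letters, return words."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # drop links
    text = re.sub(r"@\w+", " ", text)          # drop user mentions
    text = re.sub(r"[^a-z\s]", " ", text)      # drop punctuation, digits, '#'
    return text.split()


def word_frequencies(tweets):
    """Count words across all tweets (the counts a word cloud is drawn from)."""
    counts = Counter()
    for tweet in tweets:
        counts.update(clean_tweet(tweet))
    return counts


def score_tweet(text):
    """Score = number of positive words minus number of negative words."""
    words = clean_tweet(text)
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


# Made-up sample tweets.
tweets = [
    "Great cast, awesome story! http://example.com",
    "@friend such a boring and terrible movie",
]
for t in tweets:
    print(score_tweet(t), t)
```

A positive score suggests a favourable review and a negative score an unfavourable one. A real dictionary would be much larger than the two small sets used here.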

**Get Tweets from Twitter:**

The first step is to fetch the data from Twitter. In R, we can call the Twitter API using the package…

Added by suresh kumar gorakala on January 11, 2016 at 6:00am

As part of Twitter data analysis, I have so far completed Movie review using R and Document Classification using R. Today we will deal with discovering topics in tweets, i.e. mining the tweet data to discover underlying topics, an approach known as topic modeling.

Added by suresh kumar gorakala on December 23, 2015 at 8:30pm

*Originally posted here.*

In our day-to-day lives, we come across a large number of recommendation engines, such as Facebook's recommendation engine for friend suggestions and similar "Like" pages, and YouTube's recommendation engine, which suggests videos similar to our previous searches and preferences. In today’s blog post I will explain how to build a basic…

Added by suresh kumar gorakala on October 13, 2015 at 6:30am

In my previous blog post I explained linear regression. In today’s post I will explain logistic regression.

Consider a scenario where we need to predict a patient's medical condition (HBP), either HAVE HIGH BP or NO HIGH BP, based on some observed symptoms: age, weight, Issmoking, systolic value, diastolic value, race, etc. In this…
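To make the scenario concrete, here is a minimal sketch in Python of how a logistic regression model turns such symptoms into a HIGH BP / NO HIGH BP decision. It is not the code from the original post. The intercept and weights below are hypothetical numbers that were not fitted to any data, and the function and field names are assumptions made for the example.

```python
import math


def sigmoid(z):
    """Map any real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients, NOT fitted to any data.
INTERCEPT = -12.0
WEIGHTS = {
    "age": 0.03,
    "weight": 0.02,
    "is_smoking": 0.6,
    "systolic": 0.05,
    "diastolic": 0.03,
}


def predict_hbp_probability(patient):
    """Linear combination of the features, passed through the sigmoid."""
    z = INTERCEPT + sum(WEIGHTS[name] * patient[name] for name in WEIGHTS)
    return sigmoid(z)


# A made-up patient record.
patient = {"age": 55, "weight": 82, "is_smoking": 1, "systolic": 150, "diastolic": 95}
p = predict_hbp_probability(patient)
print("HAVE HIGH BP" if p >= 0.5 else "NO HIGH BP", round(p, 3))
```

In practice the coefficients would be estimated from labelled patient records, and the 0.5 cut-off is itself a choice that can be tuned.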

Added by suresh kumar gorakala on October 9, 2015 at 9:13am


