A Data Science Central Community

If you want to relax, travelling is probably the top pick for most people. Choosing the right place to stay on your vacation is one of the most important parts of a trip, but knowing how to choose can be a problem. Reading through a hotel's reviews is a good start: other visitors’ experiences tell you specific details about the hotel. However, this method is not comprehensive, and wading through a pile of reviews gets tiresome. Here is an approach I would like to introduce to y’all that is easy, fast and accurate: conducting a sentiment analysis.

Sentiment analysis, also known as opinion mining, is the process of computationally identifying and categorizing opinions expressed in a piece of text, especially in order to determine whether the writer’s attitude towards a particular topic, product, etc. is positive, negative, or neutral.
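To make the definition concrete, here is a minimal sketch of the idea in Python. It is not how Semantria or any commercial tool works: the two word lists are a hypothetical mini-lexicon invented for illustration, and the scoring rule (positive hits minus negative hits, divided by total hits) is just one simple assumption.

```python
import re

# Hypothetical mini-lexicon, invented for illustration only.
POSITIVE = {"great", "clean", "friendly", "excellent", "comfortable"}
NEGATIVE = {"dirty", "rude", "noisy", "terrible", "broken"}


def score_review(text: str) -> float:
    """Return a score in [-1, 1]: (positive hits - negative hits) / total hits."""
    words = re.findall(r"[a-z']+", text.lower())
    pos = sum(word in POSITIVE for word in words)
    neg = sum(word in NEGATIVE for word in words)
    total = pos + neg
    # A review with no lexicon words carries no signal, so it scores 0.
    return 0.0 if total == 0 else (pos - neg) / total


def label(score: float) -> str:
    """Map a numeric score to a polarity label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for review in ("Great location and friendly staff",
                   "The room was dirty and the staff were rude",
                   "We stayed two nights"):
        print(review, "->", label(score_review(review)))
```

A word-counting approach like this misses negation and context ("not clean" still counts as a positive hit), which is one reason to lean on a dedicated tool for real reviews.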

Put into practice, businesses can use sentiment analysis to find out how their products or services are performing in the market. Basic data analysis alone may give a false idea of product performance, whereas sentiment analysis lets us dig deeper into customers’ minds.

In case you don’t know what a sentiment analysis chart looks like, here’s a typical example:

I will use Hotel the Serra, listed on Tripadvisor.com, as an example and show you how easily this analysis can be done. The procedure is divided into 3 steps.

1st step

First, we have to scrape all the reviews left on the hotel page. Python may be hard for most people, and we want a method that everyone can use, so in this part I will use Octoparse to do the work. Octoparse is a very handy web-scraping tool: its point-and-click interface shows users directly what they should do, and it requires no coding knowledge. Using this tool should help you finish this step easily.

2nd step

After the extraction, export the data to Excel. This data is our raw material. Before we get started on the analysis, I’m going to introduce another tool.

Semantria, a product from Lexalytics, is a plug-in for Excel 2013 that runs sentiment analysis without the user having to be an expert in modeling. After installing Semantria, open the file we exported earlier from Octoparse; the analysis can then be done easily by following Semantria's starter guide.

Here are the results:

In the table, there are 3 columns showing results. The “Document sentiment +/-” column gives a direct view of whether each review is positive or negative, but it alone cannot tell us how good or how bad the hotel is. For that, we look at the “Document sentiment” column, which holds a score for each review.
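The difference between the two columns can be sketched in a few lines of Python. The rows below are made up for illustration, and the keys `score` and `polarity` are hypothetical short names standing in for the “Document sentiment” and “Document sentiment +/-” columns; they are not Semantria's actual export format.

```python
from collections import Counter

# Made-up rows standing in for the exported sheet; the keys are hypothetical
# short names for the "Document sentiment" and "Document sentiment +/-" columns.
rows = [
    {"review": "Lovely rooftop and helpful staff", "score": 0.72, "polarity": "positive"},
    {"review": "Fine for one night", "score": 0.05, "polarity": "neutral"},
    {"review": "Room was not ready on arrival", "score": -0.31, "polarity": "negative"},
    {"review": "Good breakfast", "score": 0.48, "polarity": "positive"},
]

# The label column answers "how many reviews fall in each class?"
polarity_counts = Counter(row["polarity"] for row in rows)

# The numeric column answers "how strongly?", so it can rank the reviews.
ranked = sorted(rows, key=lambda row: row["score"], reverse=True)
most_positive = ranked[0]["review"]
most_negative = ranked[-1]["review"]

if __name__ == "__main__":
    print(dict(polarity_counts))
    print("Most positive:", most_positive)
    print("Most negative:", most_negative)
```

The label is enough for counting, but only the numeric score lets you order reviews or compare how strongly two hotels are liked.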

3rd step

To make the results easier to understand, we can use a data-visualization tool. I used Datawrapper for this part, plotting the Document sentiment column on a graph, and got:

The graph shows that very few of the scores are negative; most of the document sentiment scores are positive, and they fall in the range of roughly 0.4 to 0.8. This is the result we were after, and it took very little time to carry out the sentiment analysis. We can pick a few hotels we are interested in, run a sentiment analysis on each, and compare their graphs to make the final decision.
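What we read off the graph (how many scores are positive, and what range they fall in) can also be computed directly. The sketch below is one way to do it in Python; the function name `summarize` and the sample scores are assumptions made up for illustration, not data from the hotel in this article.

```python
from statistics import median


def summarize(scores):
    """Return the share of positive scores plus the range and median of the positive ones."""
    if not scores:
        raise ValueError("need at least one score")
    positive = [s for s in scores if s > 0]
    return {
        "positive_share": len(positive) / len(scores),
        "positive_min": min(positive) if positive else None,
        "positive_max": max(positive) if positive else None,
        "positive_median": median(positive) if positive else None,
    }


if __name__ == "__main__":
    # Made-up document sentiment scores for one hotel.
    sample_scores = [0.6, 0.4, 0.8, -0.2, 0.5]
    print(summarize(sample_scores))
```

Running the same function on each candidate hotel gives a few numbers that can be compared side by side instead of comparing graphs by eye.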

In fact, we do not even need to know the score range or understand the graph: we can simply average the document sentiment scores for each hotel and pick the hotel with the highest average.
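The averaging shortcut is easy to sketch in Python. The hotel names and scores below are placeholders invented for illustration, and `best_hotel` is a hypothetical helper, not part of any of the tools mentioned here.

```python
from statistics import mean


def best_hotel(scores_by_hotel):
    """Average each hotel's document sentiment scores and return (best name, all averages)."""
    # Hotels with no scores are skipped because they have no average.
    averages = {
        name: mean(scores)
        for name, scores in scores_by_hotel.items()
        if scores
    }
    if not averages:
        raise ValueError("no hotel has any scores")
    best = max(averages, key=averages.get)
    return best, averages


if __name__ == "__main__":
    # Placeholder data: one list of document sentiment scores per hotel.
    candidates = {
        "Hotel A": [0.6, 0.8, 0.4],
        "Hotel B": [0.9, -0.5, 0.2],
    }
    winner, averages = best_hotel(candidates)
    print(winner, averages)
```

An average hides how many reviews stand behind it, so it is worth checking that the hotels being compared have a similar number of reviews.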

This article introduces one simple application of sentiment analysis, which can do far more than this. The method I described can be adapted to many other settings; go and find out for yourself.

Tools used: Octoparse, Semantria, Datawrapper

© 2019 BigDataNews.com is a subsidiary of DataScienceCentral LLC and not affiliated with Systap Powered by
