A Data Science Central Community
For most people, travelling is probably the best way to relax. Choosing the right place to stay on your vacation is one of the most important parts of a trip, but it is not always obvious how to do it. Reading through a hotel's reviews is a good start: visitors' experiences tell you specific details about the place. However, this method is not comprehensive, and reading a pile of reviews quickly gets tiresome. Here is an easy and fast alternative I would like to introduce to y'all: conducting a sentiment analysis.
Sentiment analysis, also known as opinion mining, is the process of computationally identifying and categorizing opinions expressed in a piece of text, especially in order to determine whether the writer's attitude towards a particular topic, product, etc. is positive, negative, or neutral.
In practice, a business may use sentiment analysis to find out how its products or services are performing in the market. Basic data analysis alone can give a misleading picture of product performance, while sentiment analysis lets us dig deeper into what customers actually think.
In case you don't know what a sentiment analysis chart looks like, here's a typical example:
I will use Hotel the Serra, listed on Tripadvisor.com, as an example and show you how easily this analysis can be done. The procedure is divided into 3 parts.
First, we have to scrape all the reviews left on the hotel's page. Writing a scraper in Python may be hard for most people, and the goal here is a method that everyone can use, so for this part I will use Octoparse. Octoparse is a very handy web-scraping tool: its point-and-click interface shows users directly what they need to do, and it requires no coding knowledge. With this tool, this step of the job should be easy to finish.
After the extraction, export the data to Excel. This data is our raw material. Before we start on the analysis, I'm going to introduce another tool.
Semantria, a Lexalytics product, is a plug-in for Excel 2013 that runs sentiment analysis without requiring the user to be an expert in modeling. After installing Semantria, open the file we previously extracted with Octoparse; the analysis can then be done easily by following Semantria's starter guide.
Here are the results:
In the table, 3 columns show the results. The “Document sentiment +/-” column gives a direct view of whether each review of the hotel is positive or negative. That column alone cannot tell us how good or how bad the hotel is, so we also look at the “Document sentiment” column, which holds a numeric score for each review.
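To make the relationship between the two columns concrete, here is a minimal Python sketch of how a numeric score could be turned into a positive/negative/neutral label. The `polarity` function, the `neutral_band` cut-off of 0.05, and the sample scores are all assumptions made up for illustration; they are not Semantria's actual thresholds or output.

```python
def polarity(score, neutral_band=0.05):
    """Map a numeric document sentiment score to a coarse label.

    neutral_band is an illustrative assumption, not Semantria's real cut-off.
    """
    if score > neutral_band:
        return "positive"
    if score < -neutral_band:
        return "negative"
    return "neutral"


# Hypothetical scores, invented for this example only.
sample_scores = [0.62, -0.31, 0.02, 0.45]

for s in sample_scores:
    print(f"{s:+.2f} -> {polarity(s)}")
```

The label tells you the direction of a review, while the score tells you how strong it is, which is why the numeric column is the more useful one for comparing hotels.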
To make the results easier to understand, we can use a data-visualization tool. I used Datawrapper for this part, plotting the “Document sentiment” column as a graph, and got:
The graph shows that very few reviews are negative: most of the document sentiment scores are positive, and they fall roughly in the range 0.4~0.8. This is the result we were after, and it takes very little time to run a sentiment analysis like this. We can pick a few candidate hotels, run the analysis on each one separately, and compare their graphs to make the final decision.
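If you would rather read numbers than a graph, the same picture can be summarized in a few lines of Python. This is only a sketch: the `summarize` helper and the scores below are hypothetical and are not the actual results for the hotel discussed here.

```python
from statistics import median


def summarize(scores):
    """Return a few simple descriptive figures for a list of sentiment scores."""
    positives = sum(1 for s in scores if s > 0)
    return {
        "count": len(scores),
        "min": min(scores),
        "max": max(scores),
        "median": median(scores),
        "share_positive": positives / len(scores),
    }


# Hypothetical document sentiment scores, invented for illustration.
scores = [0.55, 0.72, 0.41, -0.20, 0.64, 0.48]

summary = summarize(scores)
for name, value in summary.items():
    print(f"{name}: {value}")
```

The share of positive reviews and the median score say roughly what the graph says: how often guests were happy, and how happy a typical guest was.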
Actually, we do not even need to know the score range or understand the graph. We can simply average the document sentiment scores for each hotel and pick the hotel with the highest average.
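The averaging shortcut can be sketched in Python as follows. The hotel names, the scores, and the `rank_hotels` helper are placeholders invented for this example; in practice you would paste in the “Document sentiment” column for each hotel you analyzed.

```python
from statistics import mean

# Hypothetical scores per hotel, invented for illustration.
hotel_scores = {
    "Hotel A": [0.61, 0.48, 0.70, -0.10],
    "Hotel B": [0.35, 0.52, 0.44],
    "Hotel C": [0.80, 0.20, 0.15, 0.30],
}


def rank_hotels(scores_by_hotel):
    """Average each hotel's scores and sort hotels from best to worst."""
    averages = {hotel: mean(vals) for hotel, vals in scores_by_hotel.items()}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)


ranking = rank_hotels(hotel_scores)
best_hotel, best_avg = ranking[0]

for hotel, avg in ranking:
    print(f"{hotel}: {avg:.3f}")
print(f"Pick: {best_hotel}")
```

Keep in mind that an average over only a handful of reviews can swing a lot with a single very good or very bad review, so it is worth glancing at how many reviews each hotel has before trusting the ranking.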
This article introduces one simple application of sentiment analysis, but sentiment analysis can do far more than this. The method I described here can be adapted to many other situations, so go and try it yourself.
Tools used: Octoparse, Semantria, Datawrapper