Blog Post

Sentiment Analysis & Key Phrase Detection w/ Power Query & Power BI

,

Recently I worked on a neat little POC with Patrick Leblanc for a customer in Education who wished to perform sentiment analysis and key phrase extraction on surveys completed by students regarding classes and instructors, which brings me to this blog post.

Using Azure ML and a free subscription to the Text Analytics API, I’m going to show you how to perform sentiment analysis and key phrase extraction on tweets with the hashtag #Colts (after this past Sunday’s 51-16 beat down of the Colts at the hands of the Jacksonville Jaguars, I’m bathing in the tears of Colts fans. Watch the highlights! ). Although my example here is somewhat humorous, the steps can be used to perform sentiment analysis and key phrase extraction on any text data as long as you can get the data into Power Query.

Getting Started with Power BI? Begin here!

To get started there’s a couple things you need to do first:

  1. Subscribe to the Text Analytics API, which you can find here: https://datamarket.azure.com/dataset/amla/text-analytics.
    I’ve subscribed to the free subscription since I don’t need more than 10,000 transactions/month.Text Analytics API subscribe
  2. If you want to analyze tweets like I will be doing, you’ll need to download Plus One Social, a nifty little tool that will extract tweets and load them into an Access database. Download Plus One Social here: http://plusonesocial.com/. Plus One Social downloadUsing Plus One is pretty easy, but if you need instruction on how to use the app, Plus One has a great getting started video on YouTube: https://www.youtube.com/watch?v=viXHXrDGQaA .
After setting up the subscription to the Text Analytics API and downloading the Plus One app, I used Plus One to search for the #Colts hashtag on Twitter. This allowed me to download a total of 5,022 tweets from 3,011 distinct total messengers.
PlusOne tweets downloaded
That should be enough tears for this demo.??

After running the app, an Access database is created with all the info I need to analyze the sentiment of the tweets and extract the key phrases. The Plus One Access database is created by default in my Documents folders in a Plus One Social folder.

Next, I’ll open Excel and create a Power Query to pull in the data with the tweets from the Access database created by the Plus One Social app. Open Excel, go the Data ribbon, click New Query, select From Database, and select the option From Microsoft Access Database:
Get Data from Access db

I’m just selecting the Messages table since I’m really only measuring the sentiment and pulling out the key phrases. Once you select the Messages table, click Edit to open the query in Power Query.

Then I’ll click Choose Columns button to limit the result set to just the MessageID, Message, and CreatedDate columns.

Click OK. I want to do a little data cleansing to remove some unwanted characters that I know will cause some problems for the API, like # and @. Also, I want to remove any tweets that begin with RT indicating it was a retweet. I want to try to only capture the distinct tweets.

To remove any rows beginning with RT, right-click a row and select Text Filters and select Does Not Begin With. Then in the Applied Steps window on the right, click the settings cog next to the Filtered Rows step and change the value to RT. Click OK.

Filter RT rows

I also want to replace the # and @ characters. I can do this by selectin ghte Message column and selecting Replace Values from the Transform ribbon. I’ll replace # with (hashtag) and @ with (at).

Then click Close & Load in the top left of Power Query.

To conduct the sentiment analysis (or the key phrase extraction), we can use the single response API, which you can read about here: https://azure.microsoft.com/en-us/documentation/articles/machine-learning-apps-text-analytics/.

To use the single response API in Power Query, we need to create a parameterized function. To do this, we’ll first go to the Data ribbon in Excel, click New Query, select From Other Sources, and select From OData Feed. In the URL, enter the following URL:
https://api.datamarket.azure.com/data.ashx/amla/text-analytics/v1/GetSentiment?Text=hello+world

Then click OK.

You may be prompted to enter your Azure Marketplace Account key to access the OData feed. Click the Get your Marketplace Account Key hyperlink and copy & paste your account key into the text box. Be aware that this is sensitive data.
Enter marketplace key

After entering in the account key, click Save. You should see a sentiment score, which is the sentiment score for the phrase “hello world”. We need to turn this into a function and parameterize the function.

In Power Query, go to the Home ribbon, click Advanced Editor and use the following text to create the function:

(inputText) =>
let
    Source = OData.Feed("https://api.datamarket.azure.com/data.ashx/amla/text-analytics/v1/GetSentiment?Text=" & (inputText))
in
    Source
Create sentiment score function.png
I renamed the query to be called fxGetSentimentScore, as you can see in the above screen shot. Click Close & Load.
Now we can use our function, fxGetSentimentScore, in our query against the Twitter data. Select the table containing your Twitter messages data and in the Query ribbon click Edit.
Lets add a Custom Column to call our function. Go to the Add Column ribbon in Power Query and select Add Custom Column.
Add custom column.PNG
Name the column Sentiment Score and use the following formula to call our function:
=fxGetSentimentScore([Message])
When we click OK the function will use our subscription to the Text Analytics API to begin analyzing the sentiment of the tweets. This may take a few minutes.
When its finished, click the arrows above the new SentimentScore column, select the Score column and click OK to expand the score results.
Expande sentiment score column.PNG
Now you should see the sentiment scores for each tweet. Click Close & Load in the top left. Again, this could take a few minutes. With the sentiment score in Excel, we can dump the results into our reporting tool of choice to see what people are saying about the #Colts!:)
View sentiment scores
*I apologize for the foul language, there were some angry fans!
We can also perform the key phrase extraction very similarly to the sentiment analysis, except we create a different function which you can see here:
(inputText) =>
let
    Source = OData.Feed("https://api.datamarket.azure.com/data.ashx/amla/text-analytics/v1/GetKeyPhrases?Text=" & (inputText))
in
    Source
So what I do now is create the new function just like we did before and call it fxGetKeyPhrase. Then I import the Twitter messages again and perform the key phrase analysis on the messages, which you can see the results here:
Key phrase extractions

With that complete, the only logical step now is to dump it into Power BI, of course!

Power BI report

I thought that was pretty cool and worth sharing. Patrick Leblanc recently posted a blog on how to do the same thing I just did within Power BI. You can read Patrick Leblanc’s blog on Sentiment Analysis with Power BI here: http://patrickdleblanc.com/wordpress/?p=41.

Resources

Subscribe to the Text Analytics API I used here.

Here’s the documentation on the Sentiment Analysis and Key Phrase extraction here. This is helpful!
Download Plus One Social here.
Download the Power BI Word Cloud and other custom visuals here.

Feedback?

I hope you found this helpful! If you have any questions or feedback, feel free to leave it in the comments!

 

Rate

You rated this post out of 5. Change rating

Share

Share

Rate

You rated this post out of 5. Change rating