Recently I had to go through some training that asked me to evaluate some text. I needed to tally some data up, and decided Word and Excel were quick and dirty ways to do this.
The Task
The instructions were:
- Read the text
- Put a check by the sentences you agree with
- Put a question mark by those you are unsure about
- Leave others blank
- Determine how much in agreement you are with the text
The challenge was this was 5-6 paragraphs, each with 3-5 sentences, so trying to mark these on the screen and then tally them wasn’t easy. Being a data person, I wanted to calculate something more accurate and easy.
So, I thought about getting this data into a better format and wanting to use math to calculate percentages. While there are lots of ways to split text in T-SQL, Python, etc. I decided for a one-off, Word and Excel worked well.
Word Formatting
I started in Word because, well, this is text and Word is for text. I’ve also pasted lots of text into Excel and it often keeps enough formatting that things are combined into one cell.
Since I needed sentences, I pasted text and did a search and replace for a period and a space, as shown inbrackets [. ]. The replacement was more complex. I tried n and /n and a few things, then I noticed the “More” button at the bottom.
When I expanded this, I saw a number of options, and I used the paragraph mark, since I wanted each sentence on a separate line.
This isn’t the text I was using, but I grabbed this from sqlsaturday.com for this post. Here’s the before:
Here’s the after:
It was well formatted text, so this gave me the split of data into sentences.
Excel for the Math
I then copy/pasted this into Excel, which put each sentence into a separate cell in a column. I had a few blanks, but easy to sort the data (ordering didn’t matter here) and get all my data at the top.
From here, I just added my marks to the columns at the end. I also added a title row, just for me. I then used a “1” for agreement in the columns. For those where I wasn’t marking either column, I used two zeros, just to make things look better. I had something like this:
The formula I used was a sum, divided by the count. For the count, I just used the final row number (9), subtracting the title row. This way I could copy this across each column.
Easy way to calculate some one-off totals from data that I need to evaluate for text data with a little Office ELT.