Let's say you want to sort some data you found online, messily embedded in a web page. Or maybe you just want to change the format of a CSV file quickly. For one-off tasks like these, I think writing a script takes more time than the method I'm going to describe in this post!
If you're a Vim or Emacs user, you're probably familiar with using macros for this type of stuff. However, if you're one of those people (just like me) who struggle to even exit Vim, then this method might come in handy.
I'm going to show this in action. Let's say we want to sort the first 10 movies in IMDb's top-rated list by their gross income. We're keeping the number small for demonstration purposes.
On the IMDb site, the gross value is listed with a dollar sign at the beginning and the letter M at the end. Our task is to strip both of them and pair the resulting value with the movie name in a nice format.
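To make the target transformation concrete, here is a minimal Python sketch of the cleanup the macro will perform on each gross value. The function name `parse_gross` and the sample input are made up for illustration; only the "dollar sign in front, M at the end" format comes from the page.

```python
def parse_gross(text: str) -> float:
    """Turn a string like '$12.5M' into 12.5 (millions of dollars)."""
    # Drop surrounding whitespace, then the leading '$' and the trailing 'M'
    cleaned = text.strip().lstrip("$").rstrip("M")
    return float(cleaned)


# Hypothetical input, not a real IMDb figure
print(parse_gross("$12.5M"))
```

The macro does the same thing by hand: delete one character at the front, one at the end, and leave the number in between.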
We copy the text from the web page to Notepad++.
Now, we will record the macro that extracts the movie name and gross value from each entry. For that, we need to identify a pattern that links the movie name to the gross value we want to extract.
In this case, I make the following observations:
With these observations, we are ready to record our macro.
Notice that I am using the End and Home keys to navigate through the line (pressing twice for wrapped lines). Also, I am using the Ctrl + Left Arrow shortcut to skip an entire word.
Now we run our macro 10 times! When the data is large, there's also an option to run the macro until the end of the file.
Notice that we removed Entry#9 because it didn't have a gross value (the extracted value was part of the vote count, i.e. garbage). We need to be careful with this kind of thing when working with macros.
Now, I want to import this beautiful data into Python. I'll turn it into a list of tuples, again using macros!
The rest is now just Python scripting: sort the tuples by their second element and print them out.
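A minimal sketch of that last step could look like the following. The entries in `movies` are placeholders rather than the actual IMDb data, and sorting from highest to lowest gross is my assumption; drop `reverse=True` for ascending order.

```python
# Placeholder entries, NOT real IMDb figures:
# each tuple is (movie name, gross in millions of dollars)
movies = [
    ("Movie A", 57.3),
    ("Movie B", 134.97),
    ("Movie C", 4.36),
]

# Sort by the second element of each tuple, highest gross first
ranked = sorted(movies, key=lambda entry: entry[1], reverse=True)

for name, gross in ranked:
    print(f"{name}: ${gross}M")
```

`sorted` returns a new list and leaves `movies` untouched, which is handy if you want to try a few different orderings.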
That's all! It might look like a slow process because I explained it in detail, but I believe it's an extremely quick and handy method once you get used to it.