OK, maybe not the only software you’ll ever need, but only maybe.
At least it’s all you need to implement any Practical Economics analysis provided on this blog. It’s what we use. If anything extra is required for anything we post we’ll let you know how to install it at the time.
What is Python? According to the boilerplate text:
Python is an interpreted, high-level, general-purpose programming language. Created by Guido van Rossum and first released in 1991, Python’s design philosophy emphasizes code readability with its notable use of significant whitespace. Its language constructs and object-oriented approach aim to help programmers write clear, logical code for small and large-scale projects.
All you need to know is that Python trades off a bit of performance (speed) for a large increase in readability. It’s much closer to everyday language than most other programming languages out there. It is also general-purpose, so once you know how to use it you can apply it to a very wide range of tasks.
And yes it’s named after Monty Python – computer programmers will be computer programmers!
There is an incredible amount of free support on the internet that is more specific and searchable than a user’s manual written as an afterthought. For example, at the time of writing Stack Overflow had over 1.25 million questions on Python, all answered by very smart people for no reward other than a bit of prestige and goodwill.
If you have a problem, chances are someone else has already had it, solved it and posted it on the internet.
And it’s free!
The downside is that it’s not as easy to install as many commercial packages and every now and again you have to poke and prod the code a bit to find the right way to do something.
But it’s worth it.
Don’t think you need more than Excel? There are two main advantages to moving from Excel to a coding-based method:
- the amount and complexity of the calculations you can do on very large data sets with a few simple commands is much greater than Excel is capable of; and
- well written code/instructions are far easier to follow than playing ‘trace precedents’ tag in Excel.
When we say well written we mean readable and easy to follow for the person reading it after you. Code, like good writing, should have a large degree of empathy for the reader.
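As a rough illustration of what readable means here, the sketch below computes a net present value in a few named steps. The function name, the discount rate and the cash flows are all made up for this example; they are not taken from any analysis on this blog.

```python
def net_present_value(discount_rate, cash_flows):
    """Discount a list of yearly cash flows back to year 0 and add them up.

    cash_flows[0] is the amount in year 0, cash_flows[1] in year 1, and so on.
    """
    total = 0.0
    for year, amount in enumerate(cash_flows):
        total += amount / (1 + discount_rate) ** year
    return total


# Hypothetical project: spend 100 today, receive 60 in each of the next two years.
project_npv = net_present_value(0.10, [-100, 60, 60])
print(round(project_npv, 2))
```

Someone picking this up later can read the whole calculation top to bottom, rather than clicking from cell to cell to find out where a number came from.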
Programming might be a little more intimidating initially, but the rewards are great if you can break your Excel addiction. Personally, we banned ourselves from using Excel for even the simplest calculations and forced ourselves to use Python. It doesn’t take long before you’ll be hooked on more complex stuff.
Also, don’t think of it as coding or programming in the sense of writing professional programmer quality code for commercial use. You’re just writing the functions you need to do the job you want. The hard part is being able to conceptualize what you need.
Don’t worry about the programming snobs telling you your code needs to be Pythonic. As long as it gets the job done it’s fine. And no, you don’t have to know what Object-Oriented Programming means, much less do it.
Like most open-source software, Python is built around a base language (Python) plus additional libraries/packages written by insanely smart people in universities and businesses around the world. Once a package is installed, you still have to import it in each file that needs it. This is because there are too many packages to load all of them all of the time without blowing your computer’s working memory.
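A minimal sketch of what importing looks like in practice. To keep it runnable on a bare Python install it uses two modules that ship with base Python (math and statistics) rather than the add-on packages discussed in this post, and the growth rates are made-up numbers.

```python
# Imports go at the top of each file, and only for what that file uses.
import math                   # bring in a whole module
from statistics import mean   # or just one function from a module

# The same pattern applies to add-on packages once they are installed, e.g.
# import pandas as pd

growth_rates = [0.021, 0.034, 0.018]   # made-up numbers for illustration
average_growth = mean(growth_rates)
doubling_time = math.log(2) / math.log(1 + average_growth)

print(f"Average growth: {average_growth:.3%}")
print(f"Years to double: {doubling_time:.1f}")
```

If you delete either import line, Python stops with a NameError the first time the missing name is used, which is usually the sign that an import has been forgotten.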
You’ll obviously need to install some ‘build’ of Python. You will also need the following Python packages:
- pandas – allows incredibly useful data structures (mainly dataframes, think data tables in Excel spreadsheets on steroids) and data processing capability on those structures. We can’t say enough about how useful pandas has been in the Proprietor’s day job.
- numpy – a package that supports large arrays and matrices of numbers, with many useful functions that can be performed on those arrays and matrices. This is the essential numerical bolt-on to Python.
- scipy – useful for statistics/econometrics, linear algebra and optimization (technically pandas and numpy are part of the wider SciPy ecosystem as well). Very handy. The statistics aren’t as highly regarded as R’s, but they’re still very good.
- matplotlib – a 2-D charting package that will print charts from data sets large enough to blow up Excel’s brain 10 times over. There is also some seriously clever stuff in here. For example, if you’ve ever been frustrated trying to do a step chart in Excel then matplotlib has the answer for you.
- Pyomo/GLPK solver – this is a real modelling superpackage. With it you can do mixed-integer linear programming, perfect for electricity dispatch or computable general equilibrium modelling (the best method for economic impact, and sometimes economic welfare, analysis).
The pandas, numpy, scipy and matplotlib packages come as part of the Anaconda ‘build’ of Python. Pyomo and GLPK are a bit of extra work but well worth it.
To download these programs you’ll need administrator access to your computer.
The Anaconda build of Python contains more packages automatically installed than we at Practical Economics know what to do with – we tend to use just the ones listed above for data analysis.
To install Anaconda, go to the download page and click on the latest version. It’s 3.7 at the time of writing and the default is for 64-bit Windows (check your computer). Python 3 was a major change from Python 2 that broke many support packages, which is why 2 is still an option. We will go with the latest.
Just download the exe file, run it and follow the prompts. That’s it, but it will take a while.
You can open Anaconda/Python by going to your computer’s Start Menu and opening Anaconda 3 (64-bit) >> Spyder (Anaconda 3). Pin this to your Task Bar if you want quick access.
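Once Spyder is open, a quick sanity check is to ask Python which version and build it is. This is only a sketch using two modules from base Python (sys and platform); the exact numbers printed will depend on the version you downloaded.

```python
import platform
import sys

# sys.version_info holds the version of the Python that is running this code.
major, minor = sys.version_info[:2]
print(f"Running Python {major}.{minor}")

# platform.architecture() reports whether this Python build is 32-bit or 64-bit.
bits, _ = platform.architecture()
print(f"This is a {bits} build")

if major < 3:
    print("This is Python 2; the posts on this blog assume Python 3.")
```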
Once you have Anaconda, Pyomo can be installed by going to your Start Menu and opening Anaconda 3 (64-bit) >> Anaconda Prompt (it looks like an old DOS prompt).
Once the Anaconda Prompt is open, type in either of the two commands (you don’t have to change the directory):
conda install -c conda-forge pyomo
conda install -c conda-forge/label/cf201901 pyomo
Enter y when it asks for yes/no, wait for the process to end and you’re away.
The conda install command in the prompt is the way to install any Python package you don’t already have installed (the other major way is with pip). For example, another useful package not native to Anaconda is pymysql, which allows you to read data directly from MySQL databases. To install this you would input into the command prompt:
conda install -c anaconda pymysql
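To confirm that the installs worked, one option is to ask Python whether it can find each package. The sketch below uses importlib from base Python; the list of names is simply the packages mentioned in this post, so trim it to the ones you actually installed.

```python
from importlib.util import find_spec

# Import names of the packages mentioned in this post.
packages = ["pandas", "numpy", "scipy", "matplotlib", "pyomo", "pymysql"]

# find_spec returns None when Python cannot locate a top-level package.
installed = {name: find_spec(name) is not None for name in packages}

for name, found in installed.items():
    print(f"{name:<12} {'installed' if found else 'MISSING'}")
```

Anything reported as MISSING can be installed from the Anaconda Prompt with the conda install command described above.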
Pyomo requires a (usually mixed-integer linear programming) solver program to work. You can spend tens of thousands of dollars on commercial programs if you’ve got very large problems to solve, but for now we’ll use the open source GLPK solver.
To download GLPK, first go to the GLPK Windows download page and then follow the instructions here:
- Go to Control Panel to determine whether you have 32-bit or 64-bit Windows (assume 64-bit from now on).
- Download the latest version of GLPK, 4.65 at the time of writing, from the following address https://sourceforge.net/projects/winglpk/.
- Extract the Zip folder by right-clicking on it and then selecting 7-Zip >> Extract Here.
- Move the glpk-4.65 folder from your downloads folder to your C: drive.
- Assuming you’re using 64-bit Windows, click on the C:\glpk-4.65 folder in Windows Explorer, click on the w64 folder, and select and copy the file path, which should be C:\glpk-4.65\w64.
- Search for and open your Control Panel, then select System and Security >> System >> Advanced system settings >> Environment Variables. Then click on ‘Path’ in the top window, click the ‘Edit’ button, then ‘New’.
- Paste the file path you copied above and save.
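To check that the path edit in the steps above took effect, you can ask Python to look for the GLPK solver program the same way the command prompt would. This sketch assumes the solver executable is called glpsol, which is the name used in the GLPK Windows download.

```python
import shutil

# shutil.which searches the folders on PATH, much as the command prompt does.
glpsol_path = shutil.which("glpsol")

if glpsol_path is None:
    print("glpsol not found on PATH; re-check the Environment Variables step.")
else:
    print(f"GLPK solver found at {glpsol_path}")
```

Programs only read the path when they start, so close and reopen Spyder or the Anaconda Prompt after editing the environment variables and before running this check.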
That’s it. You now have the exact same quantitative analytical capacity as Practical Economics.
Obviously, downloading the programs isn’t magic: you still have to write and run intelligent code. But you’ve now got access to a tool that will elevate your work to a new level.
Next up we’ll do a post on the (very) basics of running a Python command or script. Past that, there is so much information already available on the web that we’d only be duplicating, probably in an inferior way.
Good luck.