Let’s discuss why Excel Users should learn SQL to pull their own data.
First, we will answer the basic question, What is SQL?
SQL stands for Structured Query language. It is used to manage structured information in database tables. Structured data is formatted for meaningful data retrieved to make informed business decisions.
SQL allows an organization to Create, Manage and Retrieve from its structured data.
So Why should you, a data professional, learn SQL?
On our path to answering this important question, let’s talk about the pitfalls with how you currently get your data.
Without database knowledge and access, you are reliant on another medium to deliver data for analysis. Your organization’s data is already stored in a database somewhere, but if you are not using SQL there’s a middleman or middleware involved in the situation.
This medium may be in the form of a reporting tool, an established data package or feeds from another SQL professional. Each of these options carries limitations.
Reporting Tool– Reporting software forces you to use the interface of another program. The learning curve of each reporting tool varies but you are always dependent on it.
This subjects you to upgrade pains or an outright switch to another tool at Management’s whim. If the tool permits you to make your own reports, they still run slower than direct SQL and many 3rd party reporting tools limit your maximum query pulling potential.
Instead, learn SQL and benefit from its flexibility and portability.
Predetermined Data Packages – Companies invest major resources into data warehouses or specialized databases for reporting. Aside from the limitations of an interface needed to access these, the biggest issue is the configuration of the data packages.
The person setting up the data or the Vendor of the data package determines the relationship of the tables for reporting and set constraints. The constraints are sometimes appropriate, as in the case of preventing a user from conflating a USER table and an EMPLOYEE table because he thinks the data is similar.
However, there are cases where a data warehouse manager makes tables incompatible, that had the potential to source valuable data when joined and extracted.
Data feeds from Another DB professional – This requires human intervention and unfortunately, other humans let us down(hopefully not always). Even if your Database Administrator or other DB contact is super reliable, they aren’t immune from schedule overload or sickness. Maybe you have the rare DBA who takes a vacation! How dare he?!
There’s also the concern around their knowledge depth of the table’s subject matter. Most DBAs work with a variety of tables and their subject matter of your tables may vary.
If your very knowledgeable DBA leaves the company, will her backfill be ready to help? You need to get some read-only access to the SQL source data so that your data connection(you!) is always knowledgeable and reliable.
Back to our original question, why should an Excel Expert invest more time to get proficient with SQL?
Hopefully, by now, I’ve hammered the point that we want to reduce the dependency on other mediums between us and the data.
Aside from eliminating dependencies, here are some other reasons you should be learning SQL:
Boosts your hiring prospects – Technology skills are in demand and tacking on SQL proficiency will greatly improve your standing as a data professional. Although a plus to your resume, your real-life value to a company also gets a boost. Data is referred to as a company’s lifeblood, a trend that will stick as the larger players seek an edge in every aspect of their Decision making.
It will improve the way you view data – Getting the data yourself will make you a more complete analyst. View the data each time for what it is, at its source rather than a spoon-fed package that appears when you need it(or a couple of days late if you depend on the wrong people). As you practice join tables, you will develop a greater understanding of how they interact with the rest of the tables and with the business as a whole.
The range of questions you can answer will grow immensely – Basically, start with any question and enhance your knowledge. Your questions won’t be limited by data packages but driven by your own curiosity, imagination, and SQL writing skill. Add new fields and joins without bugging anyone or logging into a clunky interface. With a higher view of the data, you determine where the adventure begins.
Excel has its limitations and you will stop breaking it(maybe) – Although Excel has a large limit on rows (1,048,576 at the time of this writing), no one said you have to fill every single one. If you are regularly working with huge datasets, use SQL to break down the dataset to a more manageable size before moving it to Excel.