Every few years, someone will point out that Excel remains “the world’s most popular BI tool.”
Yet, while Excel remain a popular choice for data manipulation, its limitations mean that it is no longer the best choice for modern applications
When first introduced, spreadsheets revolutionised data manipulation, offering simple, flexible options to ingest data and a relatively intuitive WYSIWYG computation model that allowed relatively IT illiterate users to build their own analytics.
What has changed?
Spreadsheets have been around for more than 30 years with very little change. The world, however, has changed substantially.
These changes include:
Volumes of data
Those of us that are longer in the tooth will remember when MS Excel increased the maximum number of rows from 65,000 to 1,000,000 – what a game changer that was. Yet, today, computational models may easily need to track 10s or even 100s of millions of records. Spreadsheets simply cannot cope with these volumes
Of course, they will can still be used as a sandbox environment but increasingly spreadsheets are loosing their lustre as a big data analytics option.
The lack of relationships
Spreadsheets inability to maintain many-to-many relationships also limits their usefulness.
Many data applications are intended to track complex relationships between data elements – for example, a “simple” application to track where personal information is stored within your business and who is responsible for it may very quickly out grow Excel.
Even if the data it self can be stored in the form of simple lists, the lack of workflow, approvals, security and many more features severely limits Excel’s suitability to most modern applications.
The previous example touched on regulatory concerns, such as privacy.
Increasingly, data sets must be both secure and trusted to meet regulatory goals. This means that data sources and lineage must be established and that the quality of both data and code must be verified – quite tough to do in Excel.
Whilst our partner, MANTA, is delivering lineage for Excel as one of the many data sources we track this will never provide the level of traceability of a full blown database.
MANTA can connect to Microsoft Excel, process .xlsm and .xlsx files, and look up Excel objects such as graphs and pivot tables. MANTA also recognizes some tables manually created in sheets. All the mapped objects are then connected with their source objects by analyzing queries with MANTA’s database connectors
Can spreadsheets still be a good fit?
As mentioned above, spreadsheets can still be useful, for example as scratchpad for testing assumptions, or to maintain simple lists at an early stage of an implementation.
In my experience, however, many so-called scratchpads often end up being pushed into production service, frequently without any kind of testing or management.