Macquarie University
Towards evaluating Python as a suitable data science programming language for modern computing architecture

Posted on 2023-11-10, 02:09. Authored by Anthony Peter James Shaw.

The two most popular programming languages for data science are Python and R. Both are dynamically typed, interpreted languages. Python first appeared in 1991, and R in 1993. Thirty years later, computing architecture has moved toward parallel computation to compensate for the slowing annual improvement in CPU performance: CPU performance improved by 52% per year from 1986 to 2003, then by only 22% per year from 2003 to 2015. Modern CPUs feature multiple cores and can parallelize execution even further with technologies such as Hyper-Threading. Despite this trend, Python and R remain tied to their 1990s architectural designs: Python relies on a Global Interpreter Lock that allows only one thread to execute at a time, and R has a single-threaded interpreter. Furthermore, advancing massively parallel AI and machine learning algorithms requires implementations that can leverage hardware with over 100 cores through GPU programming APIs. This research will explore the limitations that dynamically typed, interpreted languages impose on data science engineering. It will demonstrate which factors limit the scalability of Python and which alternatives are being developed. It will also propose where industry investment should be directed to ensure long-term support for data mining, processing, and engineering applications.


Table of Contents

1. Introduction -- 2. Background and state of the art -- 3. Methodology -- 4. Experiments and evaluation -- 5. Conclusion and future work -- A. Appendix - Benchmarks -- List of symbols -- Bibliography

Awarding Institution

Macquarie University

Degree Type

Thesis MRes


Master of Research

Department, Centre or School

School of Computing

Year of Award


Principal Supervisor

Amin Beheshti

Additional Supervisor 1

Xuyun Zhang


Copyright: The Author Copyright disclaimer:




104 pages

Former Identifiers

AMIS ID: 255013
