Introduction to Python Programming for Data Science
In this article, I am going to give you a brief introduction to Python Programming for Data Science. At the end of this article, you will understand what Python is, how to use it, and some basic concepts of Python with examples. The article covers the following topics.
- Python Overview
- About Interpreted Languages
- Advantages/Disadvantages of Python
- Starting Python
- Interpreter PATH
- Using the Interpreter
- Running a Python Script
- Using Variables
- Keywords
- Built-in Functions
- Strings
- Different Literals Math Operators and Expressions
- Writing to the Screen
- String Formatting
- Command Line Parameters and Flow Control
Python Overview
Python is a high-level, interpreted, interactive, and object-oriented scripting language. It is designed to be highly readable. It typically uses English keywords where other languages use punctuation, and it has fewer syntactic constructions than other languages.
Guido van Rossum created Python at the National Research Institute for Mathematics and Computer Science in the Netherlands in the late 1980s and early 1990s. Python is based on a variety of languages, including ABC, Modula-3, C, C++, Algol-68, SmallTalk, Unix shell, and other scripting languages.
It has the following characteristics –
- Python is an interpreted language, which means you do not need to compile it before running it.
- Python is interactive in the sense that you can sit at a Python prompt and interact directly with the interpreter to write your program.
- Python is an object-oriented programming language: it supports the object-oriented style or paradigm, which encapsulates code inside objects.
Features in Python –
- Python is easy to learn
- Python code is easy to read and debug as well
- It has plenty of libraries for various operations
- It is a cross-platform language
- It can be used for both backend and frontend development
- It provides support for scaling large programs as well
Interpreted Language
An interpreted language is a type of computer language that is interpreted rather than being compiled into machine instructions. It’s one in which the instructions are read and performed by another program rather than being directly executed by the target computer. Languages that can be interpreted include JavaScript, Perl, Python, BASIC, and others.
Features of Interpreted Languages –
- The instructions in this language are not directly executed by the target computer.
- From source code to execution, there is only one step.
- Programs that are interpreted can be changed while they are running.
- All debugging is done at runtime in this language.
- This language produces better results.
- This language has a slower performance than others.
- JavaScript, Perl, Python, BASIC, and other interpreted languages are examples.
For program execution, an interpreter often employs one of the following strategies:
- Parse the source code and carry out the desired behavior;
- Convert source code to an efficient intermediate representation or object code and run it right away;
- Explicitly run precompiled code that was created by a compiler that is part of the interpreter system.
Applications –
- Because each operator in a command language is usually an invocation to a sophisticated process such as an editor or compiler, interpreters are frequently employed to run command and glue languages.
- A virtual machine can run machine code designed for a specific hardware architecture. This is frequently used when the desired architecture is unavailable, or for executing multiple copies, among other things.
- In an interpreted language, self-modifying code is simple to implement. This has something to do with the origins of interpretation in Lisp and AI research.
Advantages/Disadvantages of Python
Advantages –
- It is simple to read, understand, and write. Python is a high-level programming language with a syntax that is similar to English. This is the reason why Python code is highly readable and easy to debug.
- Python is really simple to pick up and learn, which is why many people recommend it to newcomers. When compared to other prominent languages like C/C++ and Java, you require fewer lines of code to accomplish the same purpose.
- Python is an extremely useful programming language. Python’s simplicity allows developers to concentrate on the subject at hand. You write less code and accomplish more.
- Python is an interpreted language, which means that the code is executed line by line by Python. In the event of an error, it halts the program’s execution and reports the error. Even if the program has several faults, Python only displays one. This facilitates debugging.
- Until we run the code, Python has no idea what type of variable we’re dealing with. During execution, it assigns the data type automatically. The programmer is not required to declare variables or their data types.
- Python is released under an open-source license that has been authorized by the OSI. As a result, it is both free to use and distribute. You can get the source code, change it, and even share your own Python version.
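The point above about data types being assigned automatically during execution can be sketched as follows. This is only a small illustration; the variable names are made up for the example.

```python
# Dynamic typing: the same name can refer to values of different types,
# and no type declaration is needed.
value = 23
first_type = type(value).__name__      # type decided at runtime: int

value = "twenty-three"
second_type = type(value).__name__     # the same name now holds a str

print(first_type, second_type)
```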
Disadvantages –
- Python is an interpreted and dynamically typed language, as we mentioned earlier. Code execution that is carried out line by line is notoriously slow.
- Python’s poor pace is due to its dynamic nature, which requires it to perform additional work while executing code. As a result, Python isn’t recommended for projects where speed is critical.
- Python must make a tradeoff in order to provide developer simplicity. The Python programming language consumes a lot of RAM. When we desire memory optimization in our applications, this can be negative.
- Python is weak in mobile computing. It is a popular language for server-side development, but it is rarely used on the client side or in mobile applications, because it uses a lot of memory and has a slow processing speed when compared to other languages.
- Python programming is simple and stress-free. However, it doesn’t work well while interacting with the database. In comparison to popular technologies like JDBC and ODBC, Python’s database access layer is rudimentary and immature.
Starting Python
Here’s how to set up Python on your computer and execute it.
- Install the most recent version of Python.
- Run the installer and follow the on-screen instructions to install Python.
- Check the box to add Python to environment variables during the installation process. Python will then be added to the environment variables, and you will be able to run it from anywhere on the machine.
To create a Python script file, we can use any text editing software.
- All we have to do is save it with the .py extension. Using an IDE, on the other hand, can make our lives much easier. An IDE is a piece of software that gives the programmer helpful capabilities for application development, such as code suggestions, syntax highlighting and checking, file explorers, and so on.
- By the way, when you install Python, it comes with an IDE called IDLE. It will allow you to run Python on your PC. For novices, it’s a good IDE.
- When you launch IDLE, it launches an interactive Python Shell.
Interpreter Path
To check the full path of the current Python interpreter, you can try running this command.
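One way to do this from inside Python, offered as a minimal sketch rather than the only approach, is to print sys.executable from the standard sys module. The variable name interpreter_path is chosen just for this example.

```python
import sys

# sys.executable holds the absolute path of the running Python interpreter
# (it can be empty or None if Python cannot determine it).
interpreter_path = sys.executable
print(interpreter_path)
```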
Using the Interpreter
A program that runs other programs is known as an interpreter. When you run a Python application, the interpreter first translates the developer's source code into an intermediate language, which is then executed.
This intermediate language is called byte code. For imported modules, Python also caches the byte code in a file with the .pyc extension. The compilation happens internally and is almost completely hidden from the developer. It is merely a translation step: byte code is a platform-independent, lower-level representation of your source code, and each source statement is converted into a group of byte code instructions. This translation is done to speed up execution, because byte code can be run much faster than the original source statements.
The byte code is then executed by the Python Virtual Machine. The Virtual Machine is essentially a large loop that iterates over your byte code instructions one by one and performs their operations. It is Python's runtime engine, the component that actually runs Python scripts. In technical terms, it is simply the last step of the Python interpreter.
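The two stages described above, compiling to byte code and then running it, can be observed with the built-in compile() and exec() functions and the standard dis module. This is a rough sketch for exploration; the one-line source string and the names used are assumptions made for the example.

```python
import dis

source = "total = 1 + 2"

# Stage 1: translate the source text into a code object containing byte code.
code_object = compile(source, "<example>", "exec")

# Show the byte code instructions in a human-readable form.
dis.dis(code_object)

# Stage 2: let the virtual machine execute the byte code.
namespace = {}
exec(code_object, namespace)
print(namespace["total"])
```

The exact instructions printed by dis.dis differ between Python versions, so they are not reproduced here.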
Running a Python Script
A multi-step procedure begins when you try to run Python scripts. The interpreter will do the following tasks during this process:
- In a sequential manner, process the statements in your script.
- Compile the original code to bytecode, an intermediate format.
This bytecode is a platform-independent translation of the code into a lower-level language. Its goal is to improve code execution. As a result, the interpreter will skip the compilation phase the next time it runs your code.
- At this moment, something called a Python Virtual Machine (PVM) kicks in. Python’s runtime engine is called PVM. It’s a loop that iterates through your bytecode’s instructions, running them one by one.
The Python Execution Model describes the entire process of running Python scripts. A Python interpreter must be downloaded and installed in order to run a Python script. Here's a quick Python script that prints "Hello World!":
print("Hello World!")
The print() function prints whatever is passed to it inside the parentheses. If you're coming from another language, you'll notice there's no semicolon at the end of the statement, since Python doesn't need you to mark the end of a line. Also, to run a simple Python script, we don't need to include or import any files.
There are several ways to launch a Python script, but before we get into the various options, we must first determine whether or not a Python interpreter is installed on the machine. So open ‘cmd’ (Command Prompt) in Windows and type the following command.
python -V
The version number of the Python interpreter installed will be displayed, or an error will be displayed if it is not. Python scripts can be run in a variety of ways. Here are some options for running a Python script.
- Command Line
- Text Editor (ex. VS Code)
- Interactive Mode
- IDE (ex. Pycharm, IDLE)
Using Variables in Python
A variable is a named location in memory where data is stored. Variables can be thought of as a container for data that can be altered later in the program.
As an example, we’ve established a variable named age in this case. The variable has been given the value of 23.
age = 23
Variables can be compared to a bag in which books can be stored and replaced at any time.
age = 23
age = 22
The initial value of the variable age was 23. It was afterward updated to 22.
Declaration and Assignment in Python-
As you can see from the preceding example, you can use the assignment operator = to assign a value to a variable in Python.
Example: Declaring a variable and assigning a value to it
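A minimal sketch of such a program, assuming the variable is named greet and holds the string hello as the description that follows suggests:

```python
# Assign the string "hello" to the variable greet, then print its value.
greet = "hello"
print(greet)
```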
The variable greet was assigned the value hello in the above program. The value assigned to greet, i.e. hello, was then printed out.
Variables are written according to a given set of rules –
- Identifiers can combine lowercase (a to z) or uppercase (A to Z) letters, digits (0 to 9), and underscores (_). Valid examples are myClass, var_1, and print_this_to_the_screen.
- A digit cannot be the first character in an identifier. Variable1 is an acceptable name, but 1variable is not.
- The usage of keywords as identifiers is not permitted.
- We can't use special characters like !, @, #, $, %, and so on in an identifier.
- The length of an identifier is entirely up to you.
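The rules above can be checked from Python itself. The sketch below uses the string method isidentifier() together with keyword.iskeyword() from the standard library; the helper name is_valid_name and the sample names are made up for illustration.

```python
import keyword


def is_valid_name(name):
    """Return True if name could be used as a variable name."""
    # isidentifier() checks the character rules; iskeyword() rejects reserved words.
    return name.isidentifier() and not keyword.iskeyword(name)


candidates = ["myClass", "var_1", "Variable1", "1variable", "for", "total$"]
results = {name: is_valid_name(name) for name in candidates}

for name, is_valid in results.items():
    print(name, "->", is_valid)
```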
Keywords in Python
In Python, reserved words are known as keywords. A keyword cannot be used as a variable, function, or other identifier. They hold specific meanings and are used in Python code to carry out specific tasks. Keywords in Python are case-sensitive. Python 3.7 has 35 keywords; the exact number varies slightly between versions.
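To see the keywords of the Python version you have installed, you can use the standard keyword module. The sketch below prints the list and its length, and also illustrates case sensitivity; the variable names are chosen only for this example.

```python
import keyword

# keyword.kwlist is the list of reserved words for the running Python version.
all_keywords = keyword.kwlist
print(all_keywords)
print("Number of keywords:", len(all_keywords))

# Keywords are case-sensitive: "for" is reserved, "For" is not.
lowercase_is_keyword = keyword.iskeyword("for")
capitalized_is_keyword = keyword.iskeyword("For")
print(lowercase_is_keyword, capitalized_is_keyword)
```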
Strings in Python
A string is a sequence of characters. A character is simply a symbol. The English alphabet, for example, has 26 letters.
Computers work with numbers (binary), not characters. Although you may see characters on your screen, they are stored and manipulated internally as a series of 0s and 1s.
Strings are formed by enclosing characters within single or double quotes. Triple quotes can also be used in Python, but they are most commonly used to represent multiline strings and docstrings.
Example –
# defining strings in Python
# all of the following are equivalent
my_string = 'Hello'
print(my_string)
my_string = "Hello"
print(my_string)
my_string = '''Hello'''
print(my_string)
# triple quotes string can extend multiple lines
my_string = """Hello, welcome to
the world of Python"""
print(my_string)
Access Characters of a String in Python
Indexing allows us to access individual characters, while slicing allows us to access a group of characters. The index begins at zero. If you try to access a character outside of the index range, you’ll get an IndexError. An integer must be used as the index. We can’t use floating or other kinds because TypeError will occur.
Python sequences support negative indexing. The last item is represented by the index -1, the second last item by the index -2, and so on. Using the slicing operator : (colon), we can access a range of characters in a string.
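The two error cases mentioned above can be reproduced safely with try/except. This is a small sketch; the sample string and the variable names are assumptions made for the example.

```python
text = "dotnet"
index_message = None
type_message = None

# An index beyond the end of the string raises IndexError.
try:
    text[10]
except IndexError as error:
    index_message = str(error)

# A non-integer index such as a float raises TypeError.
try:
    text[1.5]
except TypeError as error:
    type_message = str(error)

print("IndexError:", index_message)
print("TypeError:", type_message)
```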
Example:
# Accessing string characters in Python
text = 'dotnet'
print('text =', text)
# first character
print('text[0] =', text[0])
# last character
print('text[-1] =', text[-1])
# slicing 2nd to 4th character
print('text[1:4] =', text[1:4])
# slicing 2nd to 2nd last character
print('text[1:-1] =', text[1:-1])
Output:
text = dotnet
text[0] = d
text[-1] = t
text[1:4] = otn
text[1:-1] = otne
String Operations in Python
Strings are one of the most commonly used data types in Python, and they support a wide range of operations.
String Concatenation
Concatenation is the process of joining two or more strings into a single one. In Python, the + operator accomplishes the addition of strings. Concatenating two string literals is as simple as writing them together. The * operator can be used to make a string repeat for a specified number of times.
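The three forms described above (the + operator, adjacent string literals, and the * operator) can be sketched as follows. The strings and variable names are only illustrative.

```python
first = "Hello"
second = "World"

# The + operator joins strings.
joined = first + " " + second

# Two string literals written next to each other are joined automatically.
adjacent = "Hello " "World"

# The * operator repeats a string a given number of times.
repeated = first * 3

print(joined)
print(adjacent)
print(repeated)
```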
Different Literals Math Operators and Expressions
Literals –
Literals are a type of notation used in source code to represent a fixed value. They can alternatively be characterized as unprocessed data or raw values included in variables or constants.
Example –
# Numeric literals
x = 24
y = 24.3
z = 2+3j
print(x, y, z)
Here 24, 24.3, and 2+3j will be considered as literals.
Different types of Literals in Python –
- String literals
- Numeric literals
- Boolean literals
- Literal Collections
- Special literals
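As a rough illustration of the categories listed above, the sketch below assigns one literal of each kind to a variable. The variable names and values are invented for the example.

```python
# String literal
text_literal = "Data Science"

# Numeric literals: integer, float, and complex
integer_literal = 24
float_literal = 24.3
complex_literal = 3j

# Boolean literals
boolean_literal = True

# Literal collections: list, tuple, dictionary, and set
list_literal = [1, 2, 3]
tuple_literal = (1, 2, 3)
dict_literal = {"a": 1, "b": 2}
set_literal = {1, 2, 3}

# Special literal
special_literal = None

for item in (text_literal, integer_literal, float_literal, complex_literal,
             boolean_literal, list_literal, tuple_literal, dict_literal,
             set_literal, special_literal):
    print(type(item).__name__, item)
```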
Operators in Python:
In Python, operators are special symbols that perform arithmetic or logical computations. The operand is the value on which the operator operates.
Arithmetic Operators in Python–
These operators are used for basic mathematical operations.
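A short sketch of the arithmetic operators, using two sample values chosen for the example:

```python
a = 7
b = 2

addition = a + b          # 9
subtraction = a - b       # 5
multiplication = a * b    # 14
division = a / b          # 3.5 (true division always gives a float)
floor_division = a // b   # 3 (rounds down to a whole number)
remainder = a % b         # 1
power = a ** b            # 49

print(addition, subtraction, multiplication, division,
      floor_division, remainder, power)
```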
Conditional Operators in Python–
These operators are used for checking conditions based on the comparison of values. The value returned is either True or False.
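The comparison operators can be sketched like this, with sample values picked for illustration. Each comparison produces True or False.

```python
a = 5
b = 8

is_equal = a == b             # False
is_not_equal = a != b         # True
is_less = a < b               # True
is_greater = a > b            # False
is_less_or_equal = a <= b     # True
is_greater_or_equal = a >= b  # False

print(is_equal, is_not_equal, is_less,
      is_greater, is_less_or_equal, is_greater_or_equal)
```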
Logical Operators in Python–
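Python's logical operators are and, or, and not. They combine or invert Boolean values. The following is a minimal sketch; the variable names and values are made up for the example.

```python
age = 23
has_ticket = False

is_adult = age >= 18

# and: True only when both sides are True
can_enter = is_adult and has_ticket

# or: True when at least one side is True
adult_or_ticket = is_adult or has_ticket

# not: inverts a Boolean value
needs_ticket = not has_ticket

print(can_enter, adult_or_ticket, needs_ticket)
```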
Assignment Operator in Python–
To assign values to variables in Python, assignment operators are used. The assignment operator a = 1 assigns the value 1 on the right to the left-hand variable a.
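Besides plain =, Python offers combined forms such as += and *= that update a variable in place. The sketch below records each intermediate value so the effect of every step is visible; the names are only illustrative.

```python
a = 1          # plain assignment
history = [a]

a += 4         # same as a = a + 4
history.append(a)

a *= 3         # same as a = a * 3
history.append(a)

a -= 5         # same as a = a - 5
history.append(a)

a //= 4        # same as a = a // 4
history.append(a)

print(history)
```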
Expressions –
An expression is a combination of values, variables, operators, and calls to functions. Expressions need to be evaluated. If you ask Python to print an expression, the interpreter evaluates the expression and displays the result.
Example – 4*2/7+8 evaluates to approximately 9.14
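How an expression is evaluated depends on operator precedence: multiplication and division are carried out before addition, and parentheses can change the grouping. A small sketch, with variable names chosen for the example:

```python
# Multiplication and division are evaluated first, left to right, then addition.
result = 4 * 2 / 7 + 8

# Parentheses change the order: the addition now happens first.
grouped = 4 * 2 / (7 + 8)

print(round(result, 2))
print(round(grouped, 2))
```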
Writing to the Screen in Python
For writing anything on your screen using Python or to display any formatted message, you will use the print() function. Inside parentheses, you can pass arguments as per your wish. If you don’t pass any argument, then an empty line will get printed. You can also produce a blank line or new line on your screen by using the newline character – ‘\n’.
print('\n')
You can print a statement on your screen by passing it as an argument inside the print function. Example –
print("Welcome to the world of coding!")
You can print multiple items as well, by passing them as arguments.
print("Hi! I am", 23, "years old!")
There is another argument, sep (separator). It specifies the string placed between the printed items. By default, it is a single space.
print("Hello", "World", sep="\n")
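To compare the default separator with a custom one, the sketch below writes the printed text into an in-memory buffer so the exact result can be inspected. The file and end arguments of print() and the io.StringIO buffer are additions made for this illustration and are not covered in the article.

```python
import io

buffer = io.StringIO()

# Default separator: a single space between the items.
print("Hello", "World", file=buffer)

# Custom separators.
print("Hello", "World", sep="-", file=buffer)
print("Hi! I am", 23, "years old!", sep=" | ", file=buffer)

captured = buffer.getvalue()

# Show what was collected; end="" avoids adding one more newline.
print(captured, end="")
```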
String Formatting in Python
There are multiple ways of string formatting while generating an output string in Python.
Using % Operator
The % operator is used with format specifiers such as %s, which mark where each value should be substituted into the string. Example –
print("Hello %s, it is a %s day!" % (name, weather))
Using str.format
This can be used for positional formatting. First, you need to use placeholders ‘{}’. In this case, the order of arguments matters a lot. Example
print("Hello {}, it is a {} day!".format(name, weather))
Using f-strings
This is a newer approach called formatted string literals, available since Python 3.6. The expressions inside the curly braces of an f-string are evaluated at runtime, and their values are joined with the surrounding text to form the final string. Example –
print(f"Hello, {name}, it is a {weather} day!")
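The three approaches can be placed side by side to confirm that they build the same string. In this sketch, the values given to name and weather are invented for the example.

```python
name = "Asha"        # sample value, assumed for the example
weather = "sunny"    # sample value, assumed for the example

with_percent = "Hello %s, it is a %s day!" % (name, weather)
with_format = "Hello {}, it is a {} day!".format(name, weather)
with_fstring = f"Hello {name}, it is a {weather} day!"

print(with_percent)
print(with_format)
print(with_fstring)
```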
Command Line Parameters and Flow Control in Python
Command Line Parameters are the arguments that are presented after the program name in the operating system’s command-line shell. Python has several methods for dealing with these types of parameters. The following are the three most common:
- getopt module
- argparse module
- sys.argv
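Two of the options listed above, sys.argv and argparse, can be sketched as follows. The function name parse_arguments, the name argument, and the --times option are hypothetical and exist only for this example. A fixed list is passed to the parser so the sketch runs without a real command line.

```python
import argparse
import sys


def parse_arguments(argument_list):
    """Parse a list of command-line parameters (hypothetical options)."""
    parser = argparse.ArgumentParser(description="Greet a user.")
    parser.add_argument("name", help="name of the person to greet")
    parser.add_argument("--times", type=int, default=1,
                        help="how many times to print the greeting")
    return parser.parse_args(argument_list)


# sys.argv is a plain list: item 0 is the script name, the rest are parameters.
print("Raw arguments:", sys.argv)

# In a real script you would usually call parse_arguments(sys.argv[1:]).
arguments = parse_arguments(["Asha", "--times", "2"])
for _ in range(arguments.times):
    print("Hello", arguments.name)
```

With a real script saved as, say, greet.py, the same values would come from running it as python greet.py Asha --times 2.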
Flow Control in Python–
The control flow of a program is the sequence in which its code is executed. Conditional statements, loops, and function calls govern the control flow of a Python program.
- Sequential is the default mode;
- Selection is used for decisions and branching;
- and Repetition is used for looping or repeating a piece of code several times.
Sequential –
Sequential statements are a group of statements that are executed one after another, in the order in which they appear. The drawback is that if the logic in any one of the lines fails, the rest of the source code will not be executed.
Decision Control Statements –
In Python, selection statements are also known as branching statements or decision control statements. The selection statement allows a computer to test multiple conditions and execute instructions depending on which one is true.
- Simple if
- if-else
- nested if
- if-elif-else
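The forms listed above can be combined in one small function. This is a sketch only; the function name describe_temperature and the thresholds are made up for the example.

```python
def describe_temperature(celsius):
    """Return a rough description of a temperature in degrees Celsius."""
    if celsius < 0:              # first condition
        return "freezing"
    elif celsius < 20:           # checked only if the first condition is False
        return "cool"
    else:                        # runs when no condition above is True
        if celsius < 30:         # nested if inside the else branch
            return "warm"
        else:
            return "hot"


for reading in (-5, 12, 25, 35):
    print(reading, "->", describe_temperature(reading))
```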
Repetition Statements –
A repetition statement is used to repeat a set of programming instructions (or a block of instructions). We have two loops/repetitive statements in Python: for loop and while loop.
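Both loops can be sketched briefly. The for loop below runs once per item in a range, while the while loop repeats as long as its condition stays True. The variable names are chosen for the example.

```python
# for loop: repeat once for each number from 1 to 5
squares = []
for number in range(1, 6):
    squares.append(number * number)

# while loop: repeat as long as the condition holds
countdown = []
remaining = 3
while remaining > 0:
    countdown.append(remaining)
    remaining -= 1   # without this update the loop would never end

print(squares)
print(countdown)
```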
In the next article, I am going to discuss Data Structures in Python for Data Science. In this article, I have tried to give an overview of Python Programming for Data Science. I hope you enjoyed it.