Debugging, Databases, and Project Skeletons in Python for Data Science
In this article, I am going to discuss Debugging, Databases, and Project Skeletons in Python for Data Science with Examples. Please read our previous article where we discussed Object-Oriented Programming in Python for Data Science with Examples. By the end of this article, you will understand debugging tools, error handling, unit tests, project skeletons, SQLite databases, CRUD operations, and database objects.
Debugging in Python
Debugging code is one of the most important, yet time-consuming, tasks for any developer to perform. When your code behaves strangely, crashes, or simply returns incorrect results, it is likely that your code contains a bug that caused those errors. The two main types of programming errors that can occur in any code written in any programming language are:
- Syntactic Errors: Syntactic errors occur when a command is mistyped, a colon or bracket is missing, or the indentation in your code is incorrect. Python reports them before the program runs, and most are simple to fix if you follow the traceback. (Using a variable or function that has not been defined is a different case: it raises a NameError at runtime.)
- Semantic Errors: Semantic errors occur when your code runs but produces incorrect results or behaves unexpectedly.
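To make the distinction concrete, here is a minimal sketch of a semantic error. The function names average_buggy and average are hypothetical and exist only for this illustration: both run without a traceback, but only one returns the correct mean.

```python
def average_buggy(values):
    # Semantic error: divides by a hard-coded 2 instead of the number of values.
    # The code runs without complaint, but the result is wrong for most inputs.
    return sum(values) / 2


def average(values):
    # Corrected version: divide by the actual number of values.
    return sum(values) / len(values)


data = [1, 2, 3]
print(average_buggy(data))  # runs, but the value is not the mean
print(average(data))
```

Because no exception is raised, a bug like this is usually found by comparing results against a known answer, which is where debuggers and unit tests become useful.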
Debugging is the only way to find the bug in your code and fix it to eliminate the error. I know that the first thing most of us do when confronted with a bug is to use a slew of print statements to track the execution of the code and pinpoint the location of the error. That may be a viable approach if your code is only a few lines or a few hundred, but as your codebase grows larger, that approach becomes less viable, and you will need to use something else.
Fortunately, Python is one of the most widely used programming languages. As a result, it has a plethora of debugging tools that are far more efficient and practical than inserting a print statement after every couple of lines of code.
Python Standard Debugger (PDB)
PDB is a default debugger that comes with all versions of Python, so there is no need for installation or setup; you can simply start using it if you have any Python version installed on your machine.
The PDB is a command-line debugger that allows you to insert breakpoints in your code and then run it in debug mode. You can inspect your code and stack frames at these breakpoints, which is similar to using the print statement. You can begin using the PDB by importing it and calling pdb.set_trace() (or the built-in breakpoint() in Python 3.7 and later) at the line where execution should pause.
PDB can be used to skip lines of code or iterate over a loop for a set number of times. Because the debugger is implemented as a class in the Python standard library, you can extend it as needed. PDB is a very basic debugger, but various extensions, such as rpdb and pdb++, can improve the debugging experience, and ipdb offers a similar interface if you’re working with IPython.
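As a rough sketch of how a PDB breakpoint is placed, consider the script below. The functions running_mean and main, and the debug flag, are hypothetical names chosen for this example; the flag is only there so the script can also run without pausing.

```python
import pdb


def running_mean(values):
    """Return the mean of the values seen so far at each position."""
    total = 0
    means = []
    for count, value in enumerate(values, start=1):
        total += value
        means.append(total / count)
    return means


def main(debug=False):
    data = [2, 4, 6]
    if debug:
        # Execution pauses here and an interactive (Pdb) prompt opens.
        pdb.set_trace()
    return running_mean(data)


if __name__ == "__main__":
    # Call main(debug=True) to step through the code interactively.
    print(main())
```

At the (Pdb) prompt, n executes the next line, s steps into a function call, p data prints a variable, c continues until the next breakpoint, and q quits the debugger.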
PyCharm Debugger in Python
The PDB is a command-line debugger that not everyone finds appealing or easy to use. That is one of the reasons why IDEs (Integrated Development Environments) are developed. IDEs allow you to visually debug and test your code, making it easier and more efficient to debug a codebase of any size. However, these IDEs are frequently large and require additional installation.
PyCharm, a Python-specific IDE developed by JetBrains, is one of the most well-known Python IDEs. PyCharm is more than just a debugging tool; it is a complete development environment. The PyCharm interface is not difficult to use in my opinion, but it does take some getting used to, especially if you have never used an IDE before.
PyCharm’s debugging tool employs dialog boxes to guide you through the code execution process and to let you select various debugging parameters. When using PyCharm’s debugging mode, you have the option of inserting breakpoints on specific code lines or setting exception breakpoints, which pause execution when that specific exception is raised.
Komodo Debugger in Python
If you’re creating a multi-language codebase, including Python, and need a powerful debugging environment that can handle a variety of syntax, Komodo should be your first choice. ActiveState’s Komodo is a full-featured IDE designed and developed for mixed-language applications.
Dialog boxes are used by the Komodo debugger to obtain debugger options from you. Furthermore, if you select the default debugger setting, it will run without any further prompts. Komodo has a sophisticated and adaptable method for detecting different programming languages, and it can even handle multiple languages within the same code file.
Komodo also provides various visualization options within the debugger mode, allowing you to gain a better understanding of the codebase. Furthermore, it facilitates unit testing and supports live peer viewing and team collaboration. Git can also be integrated with Komodo for real-time version control.
Dealing with Errors in Python
Errors are problems in a program that cause it to halt execution. Exceptions, on the other hand, are raised when events occur at runtime that disrupt the program’s normal flow. Python has two sorts of errors.
- Syntax errors
- Logical errors
Syntax Errors –
A syntax error is produced when the language’s proper syntax is not followed. Example –
    # check if a person is an adult or not
    age = int(input("Enter age: "))
    if age>=18
        print("You are an adult")
We can see here that the syntax error occurs because the if statement is missing the colon at the end of its line. We can correct this by writing if age >= 18: instead.
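One way to see the difference between the broken and corrected versions without running them is to pass the source text to the built-in compile() function, which raises SyntaxError for invalid code. This is only an illustrative sketch; the helper has_syntax_error is a hypothetical name, and the input() call is replaced by a fixed age so that the snippet needs no user interaction.

```python
# The same snippet twice: once without the colon, once with it.
broken_source = 'age = 20\nif age >= 18\n    print("You are an adult")\n'
fixed_source = 'age = 20\nif age >= 18:\n    print("You are an adult")\n'


def has_syntax_error(source):
    """Return True if Python cannot parse the given source text."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return True
    return False


print(has_syntax_error(broken_source))
print(has_syntax_error(fixed_source))
```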
Logical Errors –
An error that occurs at runtime, after the code has passed the syntax check, is referred to as an exception or logical error. When we divide any number by zero, for example, we get the ZeroDivisionError exception, and when we import a module that doesn’t exist, we get the ImportError exception. Example –
    # set the value of a number
    # then find the value of the number divided by 0
    number = int(input("Enter a number: "))
    result = number/0
    print(result)
We got a ZeroDivisionError in the previous example because we tried to divide a number by 0.
How to handle errors in Python?
When an error or an exception occurs, we use exception handling to manage it.
Exception Handling with Try/Except/Finally –
The Try/Except/Finally technique can be used to manage errors. We write the unsafe code in the try block, the fallback code in the except block, and the code that must always run in the finally block. An optional else block runs only when no exception was raised. Example –
    # Exception Handling
    def calc(x, y):
        try:
            result = x // y
        except ZeroDivisionError:
            print("Sorry ! You are dividing by zero ")
        else:
            print("Yeah ! Your answer is :", result)
        finally:
            # this block is always executed
            # regardless of exception generation.
            print('This is always executed')

    calc(3, 2)
    calc(3, 0)
Exceptions are raised for a preset condition –
We can raise an exception ourselves when we want to enforce a particular condition in our code. Example –
    # try for unsafe code
    try:
        age = 17
        if age < 18:
            # raise the ValueError when the condition is not met
            raise ValueError("You're not an adult!")
        else:
            print("Please cast your vote!")
    # handle the ValueError raised above
    except ValueError as e:
        print(e)
Using Unit Tests in Python
The smallest testable elements of the software are tested in unit testing, which is the first level of software testing. This is used to ensure that each component of the software works as intended.
- Python’s unittest framework is based on the xUnit approach.
- The unittest framework supports the following object-oriented concepts:
A test fixture is used as a baseline for performing tests to ensure that they are conducted in a consistent environment with repeatable results. Examples include:
- creating temporary databases
- starting a server process
A test case is a set of situations used to determine whether or not a system is working properly.
A test suite is a collection of test cases that are used to demonstrate that a software program has a specific set of behaviors by combining the tests.
A test runner is a component that sets up test running and displays the results to the user.
Let’s have a look at how we can run a test –
    import unittest

    class SimpleTest(unittest.TestCase):

        # Passes because the asserted value is True.
        def test(self):
            self.assertTrue(True)

    if __name__ == '__main__':
        unittest.main()
Possible Outcomes Include:
There are three types of test results that could occur:
- OK – This indicates that all of the tests have been completed successfully.
- FAILED – This indicates that the test failed and an AssertionError exception was thrown.
- ERROR – This indicates that the test throws a different exception than AssertionError.
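The four concepts described above (fixture, test case, test suite, test runner) can be sketched together in one short script. The function safe_divide and the class SafeDivideTest are hypothetical names invented for this illustration; the runner writes its report to an in-memory buffer so that the result can be inspected from code.

```python
import io
import unittest


def safe_divide(x, y):
    """Hypothetical function under test."""
    if y == 0:
        raise ValueError("y must not be zero")
    return x / y


class SafeDivideTest(unittest.TestCase):
    def setUp(self):
        # Test fixture: runs before every test method in this class.
        self.numerator = 10

    def test_regular_division(self):
        # Test case: a normal division returns the expected value.
        self.assertEqual(safe_divide(self.numerator, 4), 2.5)

    def test_zero_divisor_raises(self):
        # Test case: dividing by zero must raise ValueError.
        with self.assertRaises(ValueError):
            safe_divide(self.numerator, 0)


# Test suite: collect every test method of the class.
suite = unittest.TestLoader().loadTestsFromTestCase(SafeDivideTest)

# Test runner: execute the suite and capture the report in a buffer.
runner = unittest.TextTestRunner(stream=io.StringIO(), verbosity=0)
result = runner.run(suite)

print("tests run:", result.testsRun)
print("failures:", len(result.failures), "errors:", len(result.errors))
```

The result object separates result.failures (assertions that did not hold) from result.errors (any other exception), which corresponds to the FAILED and ERROR outcomes listed above.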
Project Skeleton in Python
Skeleton code is a term that refers to a project’s basic layout that does not include any data but is more than a blank template. A new install of the WordPress foundation, for example, has no user data, posts, pages, or customized settings. It’s just a skeleton for developing your site.
By “structure,” we mean the decisions you make about how your project will best achieve its goal. We must evaluate how to make the most of Python’s features in order to write clean, effective code. In practice, “structure” entails writing clean code with clear logic and dependencies, as well as organizing files and directories in the filesystem.
You can use a templated project skeleton, then customize it and generate a new project with one command. A good Python project template should meet the following requirements:
- Packaging based on modern setuptools
- Able to create CLI entry points (for command-line tools; this step should be skippable when generating a library)
- Examples of pytest integration, such as fixtures and parametrize()
- The generated project should build, pass its tests, and follow good Python style from the start.
- README.md contains instructions that are both valid and repeatable.
- setuptools_scm provides versioning based on version control. Instead of keeping a tag and a version string in the repo in sync, you cut a release simply by creating a tag.
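As a small illustration of what generating a skeleton with one command might look like, the sketch below writes an empty project layout into a temporary directory. The layout itself (a src/ folder, a tests/ folder, the package name my_project) and the helper create_skeleton are assumptions made for this example, not a prescribed standard; the file contents are placeholders only.

```python
import tempfile
from pathlib import Path

# Hypothetical layout: relative file path -> placeholder content.
SKELETON = {
    "README.md": "# my_project\n\nBuild and test instructions go here.\n",
    "pyproject.toml": "# build configuration goes here\n",
    "src/my_project/__init__.py": "",
    "src/my_project/cli.py": "def main():\n    print('hello from my_project')\n",
    "tests/test_cli.py": "# pytest tests go here\n",
}


def create_skeleton(root, files=SKELETON):
    """Create every file of the skeleton under root and return their paths."""
    root = Path(root)
    created = []
    for relative_path, content in files.items():
        target = root / relative_path
        # Create intermediate folders such as src/my_project/ as needed.
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(content)
        created.append(target.relative_to(root).as_posix())
    return sorted(created)


with tempfile.TemporaryDirectory() as tmp:
    created = create_skeleton(tmp)
    for path in created:
        print(path)
```

Dedicated templating tools do far more than this (variable substitution, prompts, post-generation hooks); the sketch only shows the idea of a structure that contains no project data yet.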
Project Directory in Python
To get the project’s root directory as an absolute path string, call os.path.abspath(path) on any file in the project’s top level and pass the result to os.path.dirname(). The top level is the outer folder that contains all other project files. Note that a relative path such as "sample_doc.txt" is resolved against the current working directory, so the script must be run from the top level for this to return the project root. Example –
    import os

    # A relative path is resolved against the current working directory
    ROOT_DIR = os.path.dirname(os.path.abspath("sample_doc.txt"))
    print(ROOT_DIR)
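The same two calls can be wrapped in a small helper so that the result no longer depends on the current working directory. This is a sketch under assumptions: project_root is a hypothetical helper name, and a temporary directory stands in for a real project folder.

```python
import os
import tempfile


def project_root(top_level_file):
    """Return the absolute directory that contains top_level_file."""
    return os.path.dirname(os.path.abspath(top_level_file))


with tempfile.TemporaryDirectory() as project_dir:
    # Stand-in for a file that lives in the project's top level.
    sample = os.path.join(project_dir, "sample_doc.txt")
    with open(sample, "w") as handle:
        handle.write("placeholder\n")

    root = project_root(sample)
    same_dir = root == os.path.abspath(project_dir)
    print(same_dir)

# Inside a script stored in the top level, project_root(__file__) would
# give the same result regardless of where the script is launched from.
```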
Creating a Database with SQLite
To create a new SQLite database, use the sqlite3 command-line tool. You don’t need any special permissions to create a database. The basic syntax of the sqlite3 command takes the database file name as its argument:
    sqlite3 DatabaseName.db
If you wish to create a new database called test_data.db, run:
    sqlite3 test_data.db
In the current directory, the command above will create the file test_data.db, which the SQLite engine uses as the database. Once a database has been created, use the .databases command at the sqlite3 prompt to confirm that it appears in the database list.
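The same database can also be created from Python with the standard-library sqlite3 module, as sketched below. The table name sample is a hypothetical placeholder, and a temporary directory is used so the example leaves no files behind; PRAGMA database_list plays the role of the .databases command.

```python
import os
import sqlite3
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    db_path = os.path.join(tmp, "test_data.db")

    # Connecting to a path that does not exist yet creates a new database.
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sample (id INTEGER PRIMARY KEY, label TEXT)"
    )
    conn.commit()

    # Each row is (sequence number, database name, file path).
    databases = conn.execute("PRAGMA database_list").fetchall()
    db_exists = os.path.exists(db_path)
    conn.close()

print(db_exists)
print(databases[0][1])
```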
CRUD Operations in Python
The abbreviation CRUD stands for CREATE, READ, UPDATE, and DELETE in computer programming. These are the four basic functions of persistent storage. In addition, each letter of the acronym can refer to a function in a relational database application and be mapped to a standard HTTP method, SQL statement, or DDS operation.
It can also refer to user-interface conventions that let users browse, search, and alter data using computer-based forms and reports. Entities are read, created, changed, and removed in this way. Those same entities can be changed by retrieving data from a service, updating its properties, and sending the data back to the service for an update. Furthermore, CRUD is data-driven, and HTTP action verbs are standardized.
- CREATE procedures: Insert a new record using the INSERT statement.
- READ procedures: Read table records with a SELECT statement, based on the primary key supplied as an input parameter.
- UPDATE procedures: Execute an UPDATE statement on the table, using the primary key of the record in the WHERE clause.
- DELETE procedures: Delete the record specified in the WHERE clause.
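The four operations above can be sketched with Python’s sqlite3 module and an in-memory database. The student table, its columns, and the sample values are assumptions made for this illustration; plain parameterized SQL statements are used here rather than stored procedures.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (rollno INTEGER PRIMARY KEY, fname TEXT, major TEXT)"
)

# CREATE: insert a new record.
conn.execute(
    "INSERT INTO student (rollno, fname, major) VALUES (?, ?, ?)",
    (1, "Asha", "Physics"),
)

# READ: fetch the record by its primary key.
row = conn.execute(
    "SELECT fname, major FROM student WHERE rollno = ?", (1,)
).fetchone()

# UPDATE: change one column of the record identified in the WHERE clause.
conn.execute("UPDATE student SET major = ? WHERE rollno = ?", ("Statistics", 1))
updated_major = conn.execute(
    "SELECT major FROM student WHERE rollno = ?", (1,)
).fetchone()[0]

# DELETE: remove the record identified in the WHERE clause.
conn.execute("DELETE FROM student WHERE rollno = ?", (1,))
remaining = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]

conn.close()
print(row, updated_major, remaining)
```

The ? placeholders let the driver handle quoting of the values, which avoids building SQL strings by hand.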
When it comes to implementing CRUD operations, an application designer has many options. One of the most efficient is to create a set of stored procedures in SQL to carry out the operations. The following are some popular naming conventions for CRUD stored procedures:
- The procedure name should end with the name of the CRUD operation it implements. The prefix for CRUD procedures should differ from the prefix used for other user-defined stored procedures.
- If you put the table name after the prefix, CRUD procedures for the same table will be grouped together.
- After you’ve added CRUD procedures, you can update the database schema by identifying the database entity on which the CRUD operations will be implemented.
Creating a Database Object
Any defined object in a database that is used to store or reference data is referred to as a database object. In practice, database objects are whatever we create with the CREATE command. Tables, views, sequences, indexes, and synonyms are some examples of database objects.
- Table – The most basic unit of storage; it is made up of rows and columns.
- View – Subsets of data from one or more tables are logically represented by a view.
- Sequence – Generates primary key values.
- Index – Enhances the speed of some queries.
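- Synonym – Gives an alternative name to an existing object.
Examples of creating some of the database objects are shown below. They use Oracle-style data types (NUMBER, VARCHAR2); other database engines use different type names.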
Example 1 – Creating a Table
    CREATE TABLE student (
        rollno NUMBER(2),
        fname VARCHAR2(14),
        major VARCHAR2(13),
        marks NUMBER(3)
    );
Example 2 – Creating a View
    CREATE VIEW details AS
    SELECT rollno RNO, fname NAME, marks Total
    FROM student
    WHERE marks > 40;
Example 3 – Creating an index
CREATE INDEX student_idx ON student(fname);
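For readers following along with SQLite, the sketch below creates comparable objects through Python’s sqlite3 module and lists them from the sqlite_master catalog table. It rests on several assumptions: SQLite type names (INTEGER, TEXT) replace the Oracle-style ones, a marks column is included so the view has something to filter on, and the sample rows are invented. SQLite has no CREATE SEQUENCE or CREATE SYNONYM statement, so only the table, view, and index are shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE student (
        rollno INTEGER PRIMARY KEY,
        fname  TEXT,
        major  TEXT,
        marks  INTEGER
    );
    CREATE VIEW details AS
        SELECT rollno AS rno, fname AS name, marks AS total
        FROM student
        WHERE marks > 40;
    CREATE INDEX student_idx ON student(fname);
    """
)

# Invented sample rows: one above and one below the view's threshold.
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?)",
    [(1, "Asha", "Physics", 72), (2, "Ben", "Math", 35)],
)

# sqlite_master records every object created in the database.
objects = conn.execute(
    "SELECT type, name FROM sqlite_master ORDER BY name"
).fetchall()

# The view exposes only the rows that satisfy marks > 40.
passing = conn.execute("SELECT name, total FROM details").fetchall()

conn.close()
print(objects)
print(passing)
```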
In the next article, I am going to discuss Functions in Python for Data Science with Examples. In this article, I tried to explain Debugging, Databases, and Project Skeletons in Python for Data Science with Examples. I hope you enjoyed it.