The zipfile module can be used to manipulate ZIP archive files.
The zipfile module does not support ZIP files with appended comments, or multi-disk ZIP files. It does support ZIP files larger than 4 GB that use the ZIP64 extensions.
The is_zipfile() function returns a boolean indicating whether or not the filename passed as an argument refers to a valid ZIP file.
Notice that if the file does not exist at all, is_zipfile() returns False.
Use the ZipFile class to work directly with a ZIP archive. It supports methods for reading data about existing archives as well as modifying the archives by adding additional files.
To read the names of the files in an existing archive, use namelist() :
The return value is a list of strings with the names of the archive contents:
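A minimal sketch of how this might be used (the archive name example.zip is an assumption):

```python
import zipfile

# Open an existing archive read-only and list the member names.
with zipfile.ZipFile('example.zip', 'r') as zf:
    print(zf.namelist())
```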
The list of names is only part of the information available from the archive, though. To access all of the meta-data about the ZIP contents, use the infolist() or getinfo() methods.
There are additional fields other than those printed here, but deciphering the values into anything useful requires careful reading of the PKZIP Application Note with the ZIP file specification.
If you know in advance the name of the archive member, you can retrieve its ZipInfo object with getinfo() .
If the archive member is not present, getinfo() raises a KeyError .
To access the data from an archive member, use the read() method, passing the member’s name.
The data is automatically decompressed for you, if necessary.
To create a new archive, simply instantiate ZipFile with a mode of 'w' . Any existing file is truncated and a new archive is started. To add files, use the write() method.
By default, the contents of the archive are not compressed:
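A sketch of creating an uncompressed archive; the file names here are assumptions:

```python
import zipfile

# Mode 'w' truncates any existing file and starts a new archive.
with zipfile.ZipFile('write_default.zip', mode='w') as zf:
    zf.write('README.txt')  # stored without compression by default
```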
To add compression, the zlib module is required. If zlib is available, you can set the compression mode for individual files or for the archive as a whole using zipfile.ZIP_DEFLATED . The default compression mode is zipfile.ZIP_STORED .
This time the archive member is compressed:
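A sketch of the same operation with deflate compression, assuming zlib is available:

```python
import zipfile

try:
    import zlib  # noqa: F401  # required for ZIP_DEFLATED
    compression = zipfile.ZIP_DEFLATED
except ImportError:
    compression = zipfile.ZIP_STORED

with zipfile.ZipFile('write_compression.zip', mode='w') as zf:
    zf.write('README.txt', compress_type=compression)
```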
It is easy to add a file to an archive using a name other than the original file name, by passing the arcname argument to write() .
There is no sign of the original filename in the archive:
Sometimes it is necessary to write to a ZIP archive using data that did not come from an existing file. Rather than writing the data to a file, then adding that file to the ZIP archive, you can use the writestr() method to add a string of bytes to the archive directly.
In this case, I used the compression argument to ZipFile to compress the data, since writestr() does not take a compression argument of its own.
This data did not exist in a file before being added to the ZIP file
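A sketch of adding in-memory data with writestr() ; the member name and message are assumptions:

```python
import zipfile

msg = 'This data did not exist in a file before being added to the ZIP file'

with zipfile.ZipFile('writestr.zip', mode='w',
                     compression=zipfile.ZIP_DEFLATED) as zf:
    zf.writestr('from_string.txt', msg)

# Read the member back to confirm it round-trips.
with zipfile.ZipFile('writestr.zip', 'r') as zf:
    print(zf.read('from_string.txt'))
```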
Normally, the modification date is computed for you when you add a file or string to the archive. When using writestr() , you can also pass a ZipInfo instance to define the modification date and other meta-data yourself.
In this example, I set the modified time to the current time, compress the data, provide a false value for create_system , and add a comment.
In addition to creating new archives, it is possible to append to an existing archive or add an archive at the end of an existing file (such as a .exe file for a self-extracting archive). To open a file to append to it, use mode 'a' .
The resulting archive ends up with 2 members:
Since version 2.3 Python has had the ability to import modules from inside ZIP archives if those archives appear in sys.path . The PyZipFile class can be used to construct a module suitable for use in this way. When you use the extra method writepy() , PyZipFile scans a directory for .py files and adds the corresponding .pyo or .pyc file to the archive. If neither compiled form exists, a .pyc file is created and added.
With the debug attribute of the PyZipFile set to 3, verbose debugging is enabled and you can observe as it compiles each .py file it finds.
Python's zip() function takes one or more iterable containers and returns a single iterator object that aggregates the values from all of them.
It is used to map elements at the same index across multiple containers so that they can be processed as a single entity.
Syntax : zip(*iterators)
Parameters : Python iterables or containers (list, string, etc.)
Return Value : Returns a single iterator object.
Python zip() with lists.
In Python , the zip() function is used to combine two or more lists (or any other iterables) into a single iterable, where elements from corresponding positions are paired together. The resulting iterable contains tuples , where the first element from each list is paired together, the second element from each list is paired together, and so on.
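For example, a small sketch pairing two lists (the sample values are made up):

```python
names = ['Alice', 'Bob', 'Cara']
scores = [85, 92, 78]

paired = list(zip(names, scores))
print(paired)  # [('Alice', 85), ('Bob', 92), ('Cara', 78)]
```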
The combination of zip() and enumerate() is useful in scenarios where you want to process multiple lists or tuples in parallel, and also need to access their indices for any specific purpose.
The zip() function in Python can also combine two or more dictionaries into a single iterable. Because iterating over a dictionary yields its keys, zip() pairs the keys of the dictionaries in order; to pair values or key-value pairs instead, pass dict.values() or dict.items() explicitly.
When used with tuples, zip() works by pairing the elements from tuples based on their positions. The resulting iterable contains tuples where the i-th tuple contains the i-th element from each input tuple.
Python’s zip() function can also be used to combine more than two iterables. It can take multiple iterables as input and return an iterable of tuples, where each tuple contains elements from the corresponding positions of the input iterables.
The zip() function only iterates as far as the shortest iterable passed. If given lists of different lengths, the resulting combination will only be as long as the shortest list passed. In the following code example:
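Sketch (the names and scores are made-up values):

```python
names = ['Alice', 'Bob', 'Cara']
scores = [85, 92]  # one element shorter

print(list(zip(names, scores)))  # [('Alice', 85), ('Bob', 92)] -- 'Cara' is dropped
```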
Unzipping means converting the zipped values back into the individual sequences they came from. This is done with the help of the " * " (unpacking) operator.
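A short sketch of unzipping (values are illustrative):

```python
pairs = [('Alice', 85), ('Bob', 92), ('Cara', 78)]

# zip(*iterable) transposes the pairs back into separate tuples.
names, scores = zip(*pairs)
print(names)   # ('Alice', 'Bob', 'Cara')
print(scores)  # (85, 92, 78)
```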
There are many possible applications of zip() , be it a student database, a scorecard, or any other utility that requires mapping groups of data.
Python includes a built-in testing framework called unittest , which allows you to write and run automated tests. This framework, inspired by JUnit , follows an object-oriented approach and includes useful features for writing test cases, test suites, and fixtures.
By the end of this tutorial, you will be able to use unittest to write and run automated tests, helping you catch bugs early and make your software more reliable.
Let's get started!
To follow this tutorial, make sure your machine has the latest version of Python installed and that you have a basic understanding of writing Python programs.
Before you can start writing automated tests with unittest , you need to create a simple application to test. In this section, you'll build a simple yet practical application that converts file sizes from bytes into human-readable formats.
First, create a new directory for the project:
Next, navigate into the newly created directory:
Create a virtual environment to isolate dependencies, prevent conflicts, and avoid polluting your system environment:
Activate the virtual environment:
Once the environment is activated, you will see the virtual environment's name ( venv in this case) prefixed to the command prompt:
Create a src directory to contain the source code:
To ensure Python recognizes the src directory as a package, add an empty __init__.py file:
Next, create and open the formatter.py file within the src directory using the text editor of your choice. This tutorial assumes you are using VSCode, which can be opened with the code command:
Add the following code to the formatter.py file to format file sizes:
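The original listing is not reproduced here, so the following is a minimal sketch of what formatter.py could contain; the exact unit thresholds and the error message are assumptions, chosen to match the outputs used in the tests later in this tutorial:

```python
import math


def format_file_size(size_bytes: int) -> str:
    """Convert a size in bytes to a human-readable string (e.g. KB, MB, GB)."""
    if size_bytes < 0:
        raise ValueError("Size cannot be negative")  # message is an assumption
    if size_bytes == 0:
        return "0B"
    units = ("B", "KB", "MB", "GB", "TB", "PB")
    index = min(int(math.log(size_bytes, 1024)), len(units) - 1)
    value = size_bytes / (1024 ** index)
    return f"{value:.2f} {units[index]}"
```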
The format_file_size() function converts a file size in bytes to a human-readable string format (e.g., KB, MB, GB).
In the root directory, create a main.py file that prompts user input and passes it to the format_file_size() function:
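A sketch of what main.py might look like (the argument-handling details are assumptions):

```python
import sys

from src.formatter import format_file_size

if __name__ == "__main__":
    try:
        size_bytes = int(sys.argv[1])
        print(format_file_size(size_bytes))
    except (IndexError, ValueError) as exc:
        print(f"Error: please pass a valid file size in bytes ({exc})")
```

Running python main.py 1048576 should then print 1.00 MB.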
This code takes a file size in bytes from the command line, formats it using the format_file_size() function, and prints the result or displays an error message if the input is invalid or missing.
Let's quickly test the script to ensure it works as intended:
Now that the application works, you will write automated tests for it in the next section.
In this section and the following ones, you'll use unittest to write automated tests that ensure the format_file_size() function works correctly. This includes verifying the proper formatting of various file sizes. Writing these tests will help confirm that your code functions as intended under different scenarios.
To keep the test code well-organized and easily maintainable, you will create a tests directory alongside the source code directory:
This structure helps keep your project tidy and makes locating and running tests easier.
First, create the tests directory with the following command:
Next, create the test_format_file_size.py file in your text editor:
Add the following code to write your first test:
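A sketch of tests/test_format_file_size.py consistent with the description below:

```python
import unittest

from src.formatter import format_file_size


class TestFormatFileSize(unittest.TestCase):
    def test_format_file_size_returns_GB_format(self):
        """1 GB (1024**3 bytes) should be formatted as '1.00 GB'."""
        self.assertEqual(format_file_size(1024**3), "1.00 GB")


if __name__ == "__main__":
    unittest.main()
```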
In this example, you define a test case for the format_file_size() function within a subclass of unittest.TestCase named TestFormatFileSize . This follows the unittest framework convention of prefixing the class name with "Test".
The class includes a single test method, test_format_file_size_returns_GB_format() , which checks that the format_file_size() function accurately formats a file size of 1 GB. It uses self.assertEqual() to verify that the function's output is "1.00 GB".
With your test case defined, the next step is to run the test.
To ensure that everything works as expected, you need to run the tests.
You can run the tests using the following command:
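Based on the description that follows, the command is:

```sh
python -m unittest tests.test_format_file_size
```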
This command runs all test cases defined in the test_format_file_size module, where tests is the directory name and test_format_file_size is the test module (Python file) within that directory.
When you execute this command, the output will look similar to the following:
This output indicates that one test was found in the tests/test_format_file_size.py file and passed within 0.000 seconds.
As the number of test files grows, you might want to run all the tests at the same time. To do this, you can use the following command:
The discover command makes the test runner search the current directory and all its subdirectories for test files that start with test_ , executing all found test functions. This approach helps manage and run a large number of tests efficiently, ensuring comprehensive test coverage across your entire codebase.
Now that you understand how unittest behaves when all tests pass, let's see what happens when a test fails. In the test file, modify the test by changing the format_file_size() input to 0 to cause a failure deliberately:
Following that, rerun the tests:
The unittest framework will now show that the test is failing and where it is failing:
The output shows that the test failed, providing a traceback indicating a mismatch between the expected result ("1.00 GB") and the actual result ("0B"). The subsequent summary block provides a concise overview of the failure.
Now that you can run your tests, you are ready to write more comprehensive test cases and ensure your code functions correctly under various scenarios.
In this section, you'll become familiar with conventions and best practices for designing tests with unittest to ensure they are straightforward to maintain.
One popular convention you've followed is prefixing test files with test_ and placing them in the tests directory. This approach ensures that your tests are easily discoverable by unittest , which looks for files starting with test_ . Organizing tests in a dedicated tests directory keeps your project structure clean and makes it easier to manage and locate test files.
If it suits your project better, you also have the option to place test files alongside their corresponding source files, a convention common in other languages like Go. For example, the source file example.py and its test file test_example.py can live in the same directory:
Another essential convention to follow is naming each test class with a Test prefix (capitalized) and ensuring that all methods within the class are prefixed with test_ . While the underscore is not mandatory, this is a widely adopted convention among Python users:
The names you give your test files, classes, and methods should describe what they are testing. Descriptive names improve readability and maintainability by clearly indicating the purpose of each test. Generic names like test_method can lead to confusion and make it harder to understand what is being tested, especially as the codebase grows.
Here are some examples of well-named test files, class names, and method names following this convention:
Having clear and specific names produces better test reporting, helping other developers (and future you) quickly grasp the tested functionality, locate specific tests, and maintain the test suite more effectively.
Running all tests can become time-consuming as your application grows and your test suite expands. To improve efficiency, you can filter tests to run only a subset, especially when working on specific features or fixing bugs. This approach provides faster feedback, helps isolate issues, and allows targeted testing.
Here’s how you can add more test methods and run specific subsets of your tests:
When you have multiple test files and only want to run a specific test file, you can provide the module name:
It is common to have more than one class in a test module. To execute only tests from a specific class, you can use:
The output will be the same as in the preceding section since there is only one class:
If you want to target a single method in a class, you can specify the method name in the command:
Running this command will execute only the specified method:
To streamline your testing process, unittest provides the -k command-line option, which allows you to filter and run only the tests that match a specific pattern (a substring). This can be particularly useful when you have an extensive suite of tests and want to run only the subset that matches a particular condition or naming convention.
For example, the following command executes only the tests that have the "gb" substring in their names:
This command will execute only the test_format_file_size_returns_format_gb test because it contains the "gb" substring. The output will be:
You can verify this with the -v (verbose) option:
In this case, the verbose output provides additional information about each test executed, making it easier to understand which tests matched the pattern and their results.
There are times when you want to skip specific tests, often due to incomplete functionality, unavailable dependencies, or other temporary conditions. The unittest module provides the skip() decorator, which allows you to skip individual test methods or entire test classes.
The skip() decorator marks tests that should not be executed. Here’s how you can use the skip() decorator in unittest :
Save and rerun the tests:
You will see the output confirming that one test was skipped:
The unittest module provides several skip-related decorators to control the execution of tests:
Multiple test methods are often similar, differing only in their inputs and expected outputs. Consider the methods you added in the previous section: each one calls the format_file_size() function with different inputs and verifies the correct output. Instead of writing separate test methods for each case, you can use parameterized tests to simplify your code and reduce redundancy.
Python 3.4 introduced subtests, which let you consolidate similar tests into a single method. To make use of subtests, rewrite the contents of test_format_file_size.py with the following code:
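A sketch of the rewritten test using subtests (the extra size/output pairs are assumptions consistent with the formatter sketch above):

```python
import unittest

from src.formatter import format_file_size


class TestFormatFileSize(unittest.TestCase):
    def test_format_file_size(self):
        test_cases = [
            (0, "0B"),
            (1024, "1.00 KB"),
            (1024**2, "1.00 MB"),
            (1024**3, "1.00 GB"),
        ]
        for size_bytes, expected in test_cases:
            with self.subTest(size_bytes=size_bytes):
                self.assertEqual(format_file_size(size_bytes), expected)


if __name__ == "__main__":
    unittest.main()
```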
This code defines a single method to test the format_file_size() function using multiple test cases. Instead of separate test methods, it uses a list of tuples for different inputs and expected outputs, iterating through them with a loop and creating a subtest for each case using self.subTest .
Now you can run the test:
You can also modify the test to cause a failure deliberately:
When you rerun the tests, you will see more details about the failure, including the specific input and expected output:
Now that you can parameterize tests with subtests, you will take it further in the next section.
So far, you have used subtests with a list of tuples to parameterize your tests. While this approach is straightforward, it has a few issues: tuples lack descriptive names, making them less readable and unclear, especially with many parameters; maintaining and updating a growing list of tuples becomes cumbersome; and adding additional metadata or more complex structures to tuples can make them unwieldy and hard to manage.
To address these issues, consider using data classes for a more structured, readable, and scalable way to organize your test parameters. Data classes help in the following ways:
Let's use subtests with data classes by rewriting the code from the previous section:
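A sketch of the data-class version described below (the field names follow the next paragraphs; the specific cases are assumptions):

```python
import unittest
from dataclasses import dataclass

from src.formatter import format_file_size


@dataclass
class FileSizeTestCase:
    size_bytes: int
    expected_output: str
    id: str


class TestFormatFileSize(unittest.TestCase):
    def test_format_file_size(self):
        test_cases = [
            FileSizeTestCase(0, "0B", id="zero_bytes"),
            FileSizeTestCase(1024, "1.00 KB", id="one_kb"),
            FileSizeTestCase(1024**2, "1.00 MB", id="one_mb"),
            FileSizeTestCase(1024**3, "1.00 GB", id="one_gb"),
        ]
        for case in test_cases:
            with self.subTest(case.id):
                self.assertEqual(format_file_size(case.size_bytes),
                                 case.expected_output)


if __name__ == "__main__":
    unittest.main()
```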
This code defines a FileSizeTestCase class using the @dataclass decorator, which includes three attributes: size_bytes , expected_output , and id . This data class serves as a blueprint for our test cases, with each test case represented as an instance of this class.
Within the TestFormatFileSize class, the test_format_file_size method iterates through a list of FileSizeTestCase instances. Each instance contains the input size and the expected output for the format_file_size function. The self.subTest context manager runs each test case independently, providing clearer test output and making it easier to identify and debug specific failures.
Now, rerun the tests with:
The tests will pass without any issues:
With this change, your tests are now more structured and maintainable, making the test cases easier to manage and understand.
Exception handling is essential and needs to be tested to ensure exceptions are raised under the correct conditions.
For instance, the format_file_size() function raises a ValueError if it receives a negative integer as input:
You can test if the ValueError exception is raised using assertRaises() :
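A sketch of such a test (the exact error message is an assumption matching the formatter sketch above):

```python
import unittest

from src.formatter import format_file_size


class TestFormatFileSizeErrors(unittest.TestCase):
    def test_format_file_size_raises_for_negative_input(self):
        with self.assertRaises(ValueError) as context:
            format_file_size(-1)
        self.assertEqual(str(context.exception), "Size cannot be negative")


if __name__ == "__main__":
    unittest.main()
```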
The unittest framework's assertRaises context manager checks that a ValueError is raised with the appropriate message when format_file_size() is called with a negative size. The str(context.exception) extracts the exception message for comparison.
To integrate this into the parameterized tests, you can do it as follows:
This code adds two additional fields to the FileSizeTestCase class: expected_error and error_message . These fields indicate whether an error is expected and what the error message should be.
A new test case with an input of -1 is included to trigger the error, with the expected_error and error_message fields set accordingly.
In the test_format_file_size() method, the code checks for the expected exception using self.assertRaises() . If an error is expected, it verifies the type and message of the exception. If no error is expected, self.assertEqual() ensures that the function's output matches the expected result.
When you save and run the tests, you will see that the tests pass, confirming that the ValueError exception was raised:
For error messages that may vary slightly, you can use regular expressions with unittest.assertRaises() :
With this in place, you can efficiently verify that your code raises the expected exceptions under various conditions.
Now that you can use unittest to write, organize, and execute tests, we will explore how to use fixtures. Fixtures are helper functions that set up the necessary preconditions and clean up after tests. They are essential for writing efficient and maintainable tests, as they allow you to share setup code across multiple tests, reducing redundancy and improving code clarity.
The topic of fixtures is extensive and best served by its own article, but here we will provide a concise introduction.
To begin with fixtures, create a separate test file named test_fixtures.py :
Add the following code:
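A sketch of test_fixtures.py consistent with the description below (the message string itself is an assumption):

```python
import unittest


class TestWelcomeMessage(unittest.TestCase):
    def setUp(self):
        # Runs automatically before every test method in this class.
        self.welcome_message = "Welcome to the file size formatter!"

    def test_welcome_message(self):
        self.assertEqual(self.welcome_message,
                         "Welcome to the file size formatter!")


if __name__ == "__main__":
    unittest.main()
```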
The setUp method is a fixture that is automatically called before each test method, ensuring a consistent setup. It initializes the welcome_message attribute, making it available to all test methods in the class. The test_welcome_message method then verifies that this fixture returns the correct message by using self.assertEqual to check that the value of self.welcome_message matches the expected string.
With the fixture in place, run the following command to execute the tests:
The output will look similar to the following:
For a more practical example that uses fixtures to set up a database, consider the following code using SQLite :
The setUp method creates an in-memory SQLite database connection and initializes the users table, ensuring a fresh setup before each test. The tearDown method closes the database connection after each test, maintaining a clean test environment. Additionally, the create_user and update_email methods allow for the insertion and updating of records within the test methods, facilitating database manipulation during tests.
With the fixture in place, you can add tests to verify if creating and updating users works correctly:
The test_create_user() function validates adding new users to the database. It creates a user with a specific username and email address, then queries the database to confirm that the user exists and that the email is correct. Conversely, the test_update_email() function verifies the ability to update a user's email address. It first creates a user entry, updates the user's email, and then checks the database to ensure the email has been correctly updated.
Now run the tests with the following command:
Now that you are familiar with fixtures, you can create more complex and maintainable tests.
In Python, docstrings are string literals that appear after defining a class, method, or function. They are used to document your code. A neat feature of Python is that you can add tests to the docstrings, and Python will execute them for you.
Take the following example:
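The listing is reconstructed here from the description that follows:

```python
def add(a, b):
    """Returns the sum of a and b.

    >>> add(2, 3)
    5
    """
    return a + b
```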
In this example, the add() function calculates the sum of two numbers, a and b . The function includes a comprehensive docstring that describes its purpose ("Returns the sum of a and b") and provides an example usage in the doctest format.
This example demonstrates how to call the function with arguments 2 and 3 , expecting a result of 5 . Docstrings with embedded doctests serve a dual purpose:
When you include a usage example in the docstring, you can use the doctest module to run these embedded tests and ensure the function behaves as expected.
Let's apply the doctest to the application code in formatter.py :
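A sketch of formatter.py with doctest examples added, consistent with the outputs described below (the implementation follows the earlier sketch and is an assumption):

```python
import math


def format_file_size(size_bytes: int) -> str:
    """Convert a file size in bytes to a human-readable format.

    :param size_bytes: the size in bytes as an integer
    :returns: the formatted string
    :raises ValueError: if size_bytes is negative

    >>> format_file_size(0)
    '0B'
    >>> format_file_size(1024)
    '1.00 KB'
    >>> format_file_size(1048576)
    '1.00 MB'
    """
    if size_bytes < 0:
        raise ValueError("Size cannot be negative")
    if size_bytes == 0:
        return "0B"
    units = ("B", "KB", "MB", "GB", "TB", "PB")
    index = min(int(math.log(size_bytes, 1024)), len(units) - 1)
    return f"{size_bytes / 1024 ** index:.2f} {units[index]}"
```

One way to run the embedded examples is python -m doctest -v src/formatter.py , where the -v flag produces the per-example output described below.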
The docstring for the format_file_size() function describes its purpose, input, output, and potential exceptions. It details that the function converts bytes to a human-readable format, takes an integer size_bytes as input, returns a formatted string, and raises a ValueError for negative inputs. The examples provided show typical usage and expected results.
To see if the doctest examples pass, enter the following command:
When you run the file, you will see the following output:
The output shows that the format_file_size function passed all tests. Specifically, it correctly returned '0B' for 0 , '1.00 KB' for 1024 , and '1.00 MB' for 1048576 . All 3 tests in the function passed, with no failures.
This article walked you through writing, organizing, and executing unit tests with Python's built-in unittest framework. It also explored features such as subtests and fixtures to help you create efficient and maintainable tests.
To continue learning more about unittest , see the official documentation for more details. unittest is not the only testing framework available for Python; Pytest is another popular testing framework that offers a more concise syntax and additional features. To explore Pytest, see our Pytest documentation guide .
Snowpark, offered by the Snowflake AI Data Cloud , consists of libraries and runtimes that enable secure deployment and processing of non-SQL code, such as Python, Java, and Scala. With Snowpark , you can develop code outside of the Snowflake environment and then deploy it back into Snowflake, eliminating the need to manage infrastructure concerns.
In this blog, we’ll cover the steps to get started, including:
How to set up an existing Snowpark project on your local system using a Python IDE.
Add a Python UDF to the existing codebase and deploy the function directly in Snowflake.
Validate the function deployment locally and test from Snowflake as well.
Dive deep into the inner workings of the Snowpark Python UDF deployment.
Check in the new code to Github using git workflows and validate the deployment.
Before we dive into the steps, it’s important to unpack why Snowpark is such a big deal.
Familiar Client Side Libraries – Snowpark brings in-house and deeply integrated, DataFrame-style programming abstractions and OSS-compatible APIs to the languages data practitioners like to use (Python, Scala, etc.). It also includes the Snowpark ML API for more efficient machine learning (ML) modeling and ML operations.
Flexible Runtime Constructs – Snowpark provides flexible runtime constructs that allow users to bring in and run custom logic. Developers can seamlessly build data pipelines, ML models, and data applications with User-Defined Functions and Stored Procedures.
Spark/Pandas Dataframe Style Programming Within Snowflake – You can perform your standard actions/transformations just like you would in Spark.
Easy to Collaborate – With multiple developers working on the programming side of things, Snowpark efficiently solves what would otherwise be a collaboration problem. You can set up your own environment on your local system and then check in/deploy the code back to Snowflake using Snowpark (more on this later in the article).
Built-In Server-Side Packages – NumPy, Pandas, Keras, etc.
Easy to Deploy – Using SnowCLI, Snowpark can easily integrate your VS Code (or any editor for that matter) with Snowflake and leverage a virtual warehouse defined to run your deployment process (we’ll also cover this later in the article).
Now with the background information out of the way, let’s get started! The first thing you’ll want to ensure is that you have all of the following:
Python (preferably between 3.8 – 3.10 but not greater than 3.10 ).
Miniconda (if you already have Anaconda, please skip).
Snowflake account with ACCOUNTADMIN access (to alter the configuration file).
Anaconda Terms & Conditions are accepted on Snowflake. See the Getting Started section in Third-Party Packages .
GitHub account.
Gitbash installed.
VS Code (Or any similar IDE to work with).
SnowSQL installed on your system.
Note: These instructions are meant to work for Windows, but are very similar for people working on Mac and Linux operating systems.
Start with forking this and creating your own repo on GitHub.
Add Miniconda scripts path to your environment variables system path (similar to C:\Users\kbasu\AppData\Local\miniconda3\Scripts ).
Create a root folder and point the VS Code to that directory using the command prompt terminal.
Clone your forked repository to the root directory. ( git clone <your project repo> ).
Move inside sfguide-data-engineering-with-snowpark-python ( cd sfguide-data-engineering-with-snowpark-python ).
Create conda environment ( conda env create -f environment.yml ). You should find environment.yml already present inside the root folder.
Activate the conda environment. ( conda activate snowflake-demo ).
Now you should see the new environment activated!
Here is what the project directory should look like:
Typically you’d want to create a separate folder structure for functions so that they can be deployed as is (no dependency):
An app.py – This is the main function where you will code. The file name can be different, but for standardization purposes, it is advised to use app.py.
An app.toml – This will help in deploying the code (we’ll share the formats for each of these files in the end).
A requirement.txt – Like any other Python project, this is consulted to find and install any specific Python library required for this particular functionality (for this particular UDF, it is kept blank as no additional library is required).
A .gitignore – This is automatically updated during git push and pull.
Main function – A simple multiplication function:
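A minimal sketch of what such a multiplication UDF handler in app.py might look like (the function and parameter names are assumptions, not the repository's exact code):

```python
def main(a: int, b: int) -> int:
    """Simple multiplication UDF handler: returns the product of a and b."""
    return a * b


# Quick local check when the file is run directly.
if __name__ == "__main__":
    print(main(6, 7))  # 42
```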
TOML file – App.zip will be automatically created while deploying the code; we'll come to that shortly. The rest should be self-explanatory.
Requirement file – Leave blank.
Git ignore file
Install the additional libraries (only for this new functionality) if required. We don’t have any in our example, but for reference, this is how you’d perform this:
Let’s test this function locally once (we need to be confident about the functionality of the newly added piece of code).
Now it’s time to deploy the code to Snowflake:
The deployment should be successful. In the next sections, we will review the deployment’s validation and the inner workings of the entire process.
Existence – The newly created Python UDF should be present under the Analytics schema under the HOL_DB database.
Content – Let’s validate the content. SnowCLI simplifies the process and automatically converts the Python code to SQL script (more on this in the next section).
Run the Function From Snowflake – Let’s test the functionality from Snowflake.
Test Locally – Now that we have tested the functionality from Snowflake, let’s test this from local.
In our example, the deployment worked just fine.
For Snowpark Python UDFs and sprocs in particular, the SnowCLI does all the heavy lifting of deploying the objects to Snowflake. Here is what it does in the background:
Dealing with third-party packages
For packages that can be accessed directly from our Anaconda environment, it will add them to the packages list in the create function SQL command.
For packages that are not currently available in our Anaconda environment, it will download the code and include them in the project zip file.
Creates a zip file of everything in your project, including the following:
Copying that project zip file to the Snowflake stage:
Creating the Snowflake function object:
More on SnowCLI can be found here since it is still being developed by Snowflake. For a comprehensive guide to writing Python UDF, check out this guide.
Next, we will deploy the changes back to git.
Configure Forked Repo – From the repository, follow this route – Settings > Secrets and variables > Actions > New repository secret near the top right and enter the names given below along with the appropriate values:
Configure SnowSQL Parameters: From the VS Code editor, press CTRL-P. Search for ~/.snowsql/config .
Configure similar to the above settings.
Push Your Changes and Commit
You should already see pending changes in your VS Code git source control section.
Enter a suitable message and commit.
If not added already, add your username and email ID as authorized credentials for git:
You can commit and then sync, or there’s an option to commit and sync.
Verify the success message and no pending changes in the source control section.
Verify the Changes on git
Go to the actions tab in GitHub and find the latest git workflow run.
Click on the latest run and verify the details.
In the latest codebase, verify that the latest changes are present.
Now you may ask, when is it the right time to go via the Snowpark route? Here are some of our best practices to follow:
If you have only a few Python (or any other non-SQL programming language) UDFs to write or already written, you may want to go via the Python worksheet route instead.
If your data pipeline requirements are quite straightforward—i.e., they don’t have too many different conditional logics, too many different types of workloads, or changing requirements—and pipelines can be written in plain SQL without any foreseeing debugging issues, then you may not want to over-engineer and stick to traditional SQL-based Snowflake pipelines.
If you have a simple migration requirement (e.g., Hive tables must be migrated to Snowflake ), check if you can use Snowflake-Spark connectors directly instead of going via the Snowpark route.
Snowpark will have the greatest impact on the following use cases:
You have migration requirements with data cleansing/transformation/standardization logic written/not written in Spark.
You have different types of workloads to handle with varying requirements.
You have different developers working on building data pipelines/UDFs/stored procedures in the same environment.
You have code written in different languages (Java/Python, etc.), and you want a common platform without having to worry about infrastructure considerations.
You already have an SQL-based data pipeline, but changing it to meet new requirements would require extensive changes.
You want a Spark-like programming environment.
Thank you so much for reading! Our hope is that this article helps you get the most out of Snowpark. If you need help, have questions, or want to learn more about how Snowpark & Snowflake can help your business make more informed decisions, the experts at phData can help!
As Snowflake’s 2024 Partner of the Year , phData can confidently help you get the most out of your Snowflake investment. Explore our Snowflake services today, especially our Snowpark MVP program.
When you use the Snowflake Python connector, you fetch data from Snowflake and bring it to the compute instance (on the public cloud you are using – AWS/Azure/GCP) where your Python code runs for further processing. In Snowpark, your Python (or Scala/Java) data-processing program, coded as a UDF, runs on the Snowflake engine itself (using SnowCLI to connect to a virtual warehouse). So you do not need to bring the data out of Snowflake for processing, making it a better choice if you're concerned about security. In addition, there are the following benefits of using Snowpark instead:
Support for interacting with data within Snowflake using libraries and patterns purpose built for different languages without compromising on performance or functionality.
Support for authoring Snowpark code using local tools such as Jupyter, VS Code, or IntelliJ.
Support for pushdown for all operations, including Snowflake UDFs. This means Snowpark pushes down all data transformation and heavy lifting to the Snowflake data cloud, enabling you to efficiently work with data of any size.
By now, you must already know that Snowpark requires users/developers to be comfortable with at least one programming language. Here are some additional points:
Snowpark is a relatively new technology, and some bugs or performance issues may not yet have been identified.
Snowpark is a programming model, and it requires some level of programming expertise to use.
Snowpark is not currently available for all Snowflake regions.
There are some limitations around writing stored procedures.
Python allows you to quickly create zip/tar archives.
The following command compresses an entire directory.
The following command lets you control which files to include in the archive.
Here are the steps to create a ZIP file in Python.
Step 1) To create an archive file from Python, make sure your import statement is correct and in order. Here the import statement for the archive is from shutil import make_archive .
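A sketch of Step 1, assuming we want to zip a guru99 folder in the user's home directory (names and paths are assumptions):

```python
import os
from shutil import make_archive

# Compress the whole guru99 directory into archive.zip in the home folder.
archive_name = os.path.expanduser(os.path.join('~', 'archive'))
root_dir = os.path.expanduser(os.path.join('~', 'guru99'))
make_archive(archive_name, 'zip', root_dir)
```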
Code Explanation
Step 2) Once the archive file is created, right-click on the file and select your operating system (Windows Explorer); it will show the archive files, as shown below.
The archive.zip file now appears on your operating system (Windows Explorer).
Step 3) When you double-click the file, you will see the list of all the files inside it.
Step 4) In Python, we have more control over the archive because we can define exactly which files should be included in it. In our case, we will include two files in the archive: "guru99.txt" and "guru99.txt.bak".
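A sketch of Step 4, assuming the two files exist in the working directory:

```python
from zipfile import ZipFile

# The "with" block closes the archive automatically when it ends.
with ZipFile('guru99.zip', 'w') as newzip:
    newzip.write('guru99.txt')
    newzip.write('guru99.txt.bak')
```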
When you execute the code, you can see that a file named "guru99.zip" is created on the right-hand side of the panel.
Note: Here we do not give a command to "close" the file, such as newzip.close(), because we use a "with" scope, so when the program leaves this scope, the file is cleaned up and closed automatically.
Step 5) When you right-click on the file (testguru99.zip) and select your operating system (Windows Explorer), it will show the archive files in the folder, as shown below.
When you double-click the "testguru99.zip" file, another window opens showing the files contained in it.
Here is the complete code:
Python 2 Example
Python 3 Example
In day-to-day data processing and file management, compressed files are a common file format. With Python you can conveniently automate the handling of compressed files, including compressing and extracting files in formats such as ZIP, TAR, and GZ. This article explains in detail how to handle these compressed files with Python, covering basic operations, commonly used libraries and their application scenarios, along with example code.
The zipfile module is Python's built-in module for handling ZIP files; it supports creating, reading, writing, and extracting ZIP files.
With the zipfile module you can conveniently read the contents of a ZIP file.
You can use the zipfile module to create a new ZIP file and add files to it.
You can also use the zipfile module to add files to an existing ZIP file.
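A combined sketch of the three operations just described (file names are assumptions):

```python
import zipfile

# Read the contents of an existing ZIP file.
with zipfile.ZipFile('example.zip', 'r') as zf:
    print(zf.namelist())

# Create a new ZIP file and add a file to it.
with zipfile.ZipFile('new_archive.zip', 'w') as zf:
    zf.write('report.txt')

# Append another file to the existing ZIP file.
with zipfile.ZipFile('new_archive.zip', 'a') as zf:
    zf.write('notes.txt')
```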
The tarfile module is Python's built-in module for handling TAR files; it supports creating, reading, writing, and extracting TAR files.
With the tarfile module you can conveniently read the contents of a TAR file.
You can use the tarfile module to create a new TAR file and add files to it.
You can also use the tarfile module to add files to an existing TAR file.
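A combined sketch of these tarfile operations (file names are assumptions):

```python
import tarfile

# Read the contents of an existing gzip-compressed TAR file.
with tarfile.open('example.tar.gz', 'r:gz') as tf:
    print(tf.getnames())

# Create a new gzip-compressed TAR file and add a file to it.
with tarfile.open('new_archive.tar.gz', 'w:gz') as tf:
    tf.add('report.txt')

# Appending ('a') only works for uncompressed TAR files.
with tarfile.open('plain_archive.tar', 'a') as tf:
    tf.add('notes.txt')
```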
The shutil module provides high-level file operations, including handling of compressed files; it supports creating and unpacking archives in ZIP and TAR formats.
With the shutil module you can conveniently create compressed archives.
With the shutil module you can also conveniently unpack compressed archives.
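A short sketch covering both directions (paths are assumptions):

```python
import shutil

# Pack the folder 'data/' into data_backup.zip ('gztar' would produce .tar.gz).
shutil.make_archive('data_backup', 'zip', 'data')

# Unpack an archive; the format is inferred from the file extension.
shutil.unpack_archive('data_backup.zip', 'restored_data')
```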
Below is an example of automatically backing up a folder: it uses the zipfile module to compress a given folder into a ZIP file and save it to a specified location.
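A sketch of such a backup script (paths and the naming scheme are assumptions):

```python
import os
import zipfile
from datetime import datetime


def backup_folder(source_dir: str, backup_dir: str) -> str:
    """Compress source_dir into a timestamped ZIP file inside backup_dir."""
    os.makedirs(backup_dir, exist_ok=True)
    stamp = datetime.now().strftime('%Y%m%d_%H%M%S')
    zip_path = os.path.join(backup_dir, f'backup_{stamp}.zip')
    with zipfile.ZipFile(zip_path, 'w', zipfile.ZIP_DEFLATED) as zf:
        for root, _dirs, files in os.walk(source_dir):
            for name in files:
                file_path = os.path.join(root, name)
                zf.write(file_path, os.path.relpath(file_path, source_dir))
    return zip_path


print(backup_folder('my_project', 'backups'))
```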
Below is an example of automatically extracting a ZIP file and processing the files inside it: after extraction, each file is processed in a simple way (for example, printing its contents).
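A sketch of such a script (the ZIP path and the "processing" step are assumptions):

```python
import os
import zipfile


def extract_and_process(zip_path: str, extract_dir: str) -> None:
    """Extract a ZIP file and print the contents of each extracted file."""
    with zipfile.ZipFile(zip_path, 'r') as zf:
        zf.extractall(extract_dir)
    for root, _dirs, files in os.walk(extract_dir):
        for name in files:
            file_path = os.path.join(root, name)
            with open(file_path, 'r', encoding='utf-8', errors='ignore') as fh:
                print(f'--- {file_path} ---')
                print(fh.read())


extract_and_process('backup.zip', 'extracted')
```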
This article described in detail how to automate the handling of compressed files with Python, including reading, creating, appending to, and extracting ZIP and TAR files. Using Python's built-in zipfile, tarfile, and shutil modules, developers can manage compressed files efficiently and automate file processing. The examples showed how to use these modules for file backup and extraction in real applications. Mastering these techniques not only improves productivity but also simplifies everyday file-management tasks.
Source code: Lib/sqlite3/
SQLite is a C library that provides a lightweight disk-based database that doesn’t require a separate server process and allows accessing the database using a nonstandard variant of the SQL query language. Some applications can use SQLite for internal data storage. It’s also possible to prototype an application using SQLite and then port the code to a larger database such as PostgreSQL or Oracle.
The sqlite3 module was written by Gerhard Häring. It provides an SQL interface compliant with the DB-API 2.0 specification described by PEP 249 , and requires SQLite 3.15.2 or newer.
This document includes four main sections:
Tutorial teaches how to use the sqlite3 module.
Reference describes the classes and functions this module defines.
How-to guides details how to handle specific tasks.
Explanation provides in-depth background on transaction control.
The SQLite web page; the documentation describes the syntax and the available data types for the supported SQL dialect.
Tutorial, reference and examples for learning SQL syntax.
PEP written by Marc-André Lemburg.
In this tutorial, you will create a database of Monty Python movies using basic sqlite3 functionality. It assumes a fundamental understanding of database concepts, including cursors and transactions .
First, we need to create a new database and open a database connection to allow sqlite3 to work with it. Call sqlite3.connect() to create a connection to the database tutorial.db in the current working directory, implicitly creating it if it does not exist:
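A minimal sketch of that call:

```python
import sqlite3

con = sqlite3.connect("tutorial.db")
```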
The returned Connection object con represents the connection to the on-disk database.
In order to execute SQL statements and fetch results from SQL queries, we will need to use a database cursor. Call con.cursor() to create the Cursor :
Now that we’ve got a database connection and a cursor, we can create a database table movie with columns for title, release year, and review score. For simplicity, we can just use column names in the table declaration – thanks to the flexible typing feature of SQLite, specifying the data types is optional. Execute the CREATE TABLE statement by calling cur.execute(...) :
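Continuing the sketch, create the cursor and the table (the column names follow the description above):

```python
cur = con.cursor()
cur.execute("CREATE TABLE movie(title, year, score)")
```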
We can verify that the new table has been created by querying the sqlite_master table built-in to SQLite, which should now contain an entry for the movie table definition (see The Schema Table for details). Execute that query by calling cur.execute(...) , assign the result to res , and call res.fetchone() to fetch the resulting row:
We can see that the table has been created, as the query returns a tuple containing the table’s name. If we query sqlite_master for a non-existent table spam , res.fetchone() will return None :
Now, add two rows of data supplied as SQL literals by executing an INSERT statement, once again by calling cur.execute(...) :
The INSERT statement implicitly opens a transaction, which needs to be committed before changes are saved in the database (see Transaction control for details). Call con.commit() on the connection object to commit the transaction:
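A sketch of the INSERT and the commit; the sample rows follow the tutorial's Monty Python theme and are illustrative:

```python
cur.execute("""
    INSERT INTO movie VALUES
        ('Monty Python and the Holy Grail', 1975, 8.2),
        ('And Now for Something Completely Different', 1971, 7.5)
""")
con.commit()
```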
We can verify that the data was inserted correctly by executing a SELECT query. Use the now-familiar cur.execute(...) to assign the result to res , and call res.fetchall() to return all resulting rows:
The result is a list of two tuple s, one per row, each containing that row’s score value.
Now, insert three more rows by calling cur.executemany(...) :
Notice that ? placeholders are used to bind data to the query. Always use placeholders instead of string formatting to bind Python values to SQL statements, to avoid SQL injection attacks (see How to use placeholders to bind values in SQL queries for more details).
We can verify that the new rows were inserted by executing a SELECT query, this time iterating over the results of the query:
Each row is a two-item tuple of (year, title) , matching the columns selected in the query.
Finally, verify that the database has been written to disk by calling con.close() to close the existing connection, opening a new one, creating a new cursor, then querying the database:
You’ve now created an SQLite database using the sqlite3 module, inserted data and retrieved values from it in multiple ways.
How-to guides for further reading:
How to adapt custom Python types to SQLite values
Explanation for in-depth background on transaction control.
Module functions
Open a connection to an SQLite database.
database ( path-like object ) – The path to the database file to be opened. You can pass ":memory:" to create an SQLite database existing only in memory , and open a connection to it.
timeout ( float ) – How many seconds the connection should wait before raising an OperationalError when a table is locked. If another connection opens a transaction to modify a table, that table will be locked until the transaction is committed. Default five seconds.
detect_types ( int ) – Control whether and how data types not natively supported by SQLite are looked up to be converted to Python types, using the converters registered with register_converter() . Set it to any combination (using | , bitwise or) of PARSE_DECLTYPES and PARSE_COLNAMES to enable this. Column names take precedence over declared types if both flags are set. Types cannot be detected for generated fields (for example max(data) ), even when the detect_types parameter is set; str will be returned instead. By default ( 0 ), type detection is disabled.
isolation_level ( str | None ) – Control legacy transaction handling behaviour. See Connection.isolation_level and Transaction control via the isolation_level attribute for more information. Can be "DEFERRED" (default), "EXCLUSIVE" or "IMMEDIATE" ; or None to disable opening transactions implicitly. Has no effect unless Connection.autocommit is set to LEGACY_TRANSACTION_CONTROL (the default).
check_same_thread ( bool ) – If True (default), ProgrammingError will be raised if the database connection is used by a thread other than the one that created it. If False , the connection may be accessed in multiple threads; write operations may need to be serialized by the user to avoid data corruption. See threadsafety for more information.
factory ( Connection ) – A custom subclass of Connection to create the connection with, if not the default Connection class.
cached_statements ( int ) – The number of statements that sqlite3 should internally cache for this connection, to avoid parsing overhead. By default, 128 statements.
uri ( bool ) – If set to True , database is interpreted as a URI with a file path and an optional query string. The scheme part must be "file:" , and the path can be relative or absolute. The query string allows passing parameters to SQLite, enabling various How to work with SQLite URIs .
autocommit ( bool ) – Control PEP 249 transaction handling behaviour. See Connection.autocommit and Transaction control via the autocommit attribute for more information. autocommit currently defaults to LEGACY_TRANSACTION_CONTROL . The default will change to False in a future Python release.
Raises an auditing event sqlite3.connect with argument database .
Raises an auditing event sqlite3.connect/handle with argument connection_handle .
Changed in version 3.4: Added the uri parameter.
Changed in version 3.7: database can now also be a path-like object , not only a string.
Changed in version 3.10: Added the sqlite3.connect/handle auditing event.
Changed in version 3.12: Added the autocommit parameter.
Changed in version 3.13: Positional use of the parameters timeout , detect_types , isolation_level , check_same_thread , factory , cached_statements , and uri is deprecated. They will become keyword-only parameters in Python 3.15.
Return True if the string statement appears to contain one or more complete SQL statements. No syntactic verification or parsing of any kind is performed, other than checking that there are no unclosed string literals and the statement is terminated by a semicolon.
For example:
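A small sketch of what this looks like:

```python
import sqlite3

print(sqlite3.complete_statement("SELECT foo FROM bar;"))  # True
print(sqlite3.complete_statement("SELECT foo"))            # False
```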
This function may be useful during command-line input to determine if the entered text seems to form a complete SQL statement, or if additional input is needed before calling execute() .
See runsource() in Lib/sqlite3/__main__.py for real-world use.
Enable or disable callback tracebacks. By default you will not get any tracebacks in user-defined functions, aggregates, converters, authorizer callbacks etc. If you want to debug them, you can call this function with flag set to True . Afterwards, you will get tracebacks from callbacks on sys.stderr . Use False to disable the feature again.
Errors in user-defined function callbacks are logged as unraisable exceptions. Use an unraisable hook handler for introspection of the failed callback.
Register an adapter callable to adapt the Python type type into an SQLite type. The adapter is called with a Python object of type type as its sole argument, and must return a value of a type that SQLite natively understands .
Register the converter callable to convert SQLite objects of type typename into a Python object of a specific type. The converter is invoked for all SQLite values of type typename ; it is passed a bytes object and should return an object of the desired Python type. Consult the parameter detect_types of connect() for information regarding how type detection works.
Note: typename and the name of the type in your query are matched case-insensitively.
Set autocommit to this constant to select old style (pre-Python 3.12) transaction control behaviour. See Transaction control via the isolation_level attribute for more information.
Pass this flag value to the detect_types parameter of connect() to look up a converter function by using the type name, parsed from the query column name, as the converter dictionary key. The type name must be wrapped in square brackets ( [] ).
This flag may be combined with PARSE_DECLTYPES using the | (bitwise or) operator.
Pass this flag value to the detect_types parameter of connect() to look up a converter function using the declared types for each column. The types are declared when the database table is created. sqlite3 will look up a converter function using the first word of the declared type as the converter dictionary key. For example:
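A sketch of such a declaration, assuming an open connection con ; the comments show which converter name sqlite3 would look up:

```python
con.execute("""
    CREATE TABLE test(
        i integer primary key,  -- will look up a converter named "integer"
        p point,                -- will look up a converter named "point"
        n number(10)            -- will look up a converter named "number"
    )
""")
```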
This flag may be combined with PARSE_COLNAMES using the | (bitwise or) operator.
Flags that should be returned by the authorizer_callback callable passed to Connection.set_authorizer() , to indicate whether:
Access is allowed ( SQLITE_OK ),
The SQL statement should be aborted with an error ( SQLITE_DENY )
The column should be treated as a NULL value ( SQLITE_IGNORE )
String constant stating the supported DB-API level. Required by the DB-API. Hard-coded to "2.0" .
String constant stating the type of parameter marker formatting expected by the sqlite3 module. Required by the DB-API. Hard-coded to "qmark" .
The named DB-API parameter style is also supported.
Version number of the runtime SQLite library as a string .
Version number of the runtime SQLite library as a tuple of integers .
Integer constant required by the DB-API 2.0, stating the level of thread safety the sqlite3 module supports. This attribute is set based on the default threading mode the underlying SQLite library is compiled with. The SQLite threading modes are:
Single-thread : In this mode, all mutexes are disabled and SQLite is unsafe to use in more than a single thread at once.
Multi-thread : In this mode, SQLite can be safely used by multiple threads provided that no single database connection is used simultaneously in two or more threads.
Serialized : In serialized mode, SQLite can be safely used by multiple threads with no restriction.
The mappings from SQLite threading modes to DB-API 2.0 threadsafety levels are as follows:
| SQLite threading mode | threadsafety attribute | SQLITE_THREADSAFE value | DB-API 2.0 meaning |
|---|---|---|---|
| single-thread | 0 | 0 | Threads may not share the module |
| multi-thread | 1 | 2 | Threads may share the module, but not connections |
| serialized | 3 | 1 | Threads may share the module, connections and cursors |
Changed in version 3.11: Set threadsafety dynamically instead of hard-coding it to 1 .
These constants are used for the Connection.setconfig() and getconfig() methods.
The availability of these constants varies depending on the version of SQLite Python was compiled with.
Added in version 3.12.
SQLite docs: Database Connection Configuration Options
Deprecated since version 3.12, removed in version 3.14: The version and version_info constants.
Each open SQLite database is represented by a Connection object, which is created using sqlite3.connect() . Their main purpose is creating Cursor objects, and Transaction control .
Changed in version 3.13: A ResourceWarning is emitted if close() is not called before a Connection object is deleted.
An SQLite database connection has the following attributes and methods:
Create and return a Cursor object. The cursor method accepts a single optional parameter factory . If supplied, this must be a callable returning an instance of Cursor or its subclasses.
Open a Blob handle to an existing BLOB .
table ( str ) – The name of the table where the blob is located.
column ( str ) – The name of the column where the blob is located.
row ( int ) – The row where the blob is located.
readonly ( bool ) – Set to True if the blob should be opened without write permissions. Defaults to False .
name ( str ) – The name of the database where the blob is located. Defaults to "main" .
OperationalError – When trying to open a blob in a WITHOUT ROWID table.
The blob size cannot be changed using the Blob class. Use the SQL function zeroblob to create a blob with a fixed size.
Added in version 3.11.
Commit any pending transaction to the database. If autocommit is True , or there is no open transaction, this method does nothing. If autocommit is False , a new transaction is implicitly opened if a pending transaction was committed by this method.
Roll back to the start of any pending transaction. If autocommit is True , or there is no open transaction, this method does nothing. If autocommit is False , a new transaction is implicitly opened if a pending transaction was rolled back by this method.
Close the database connection. If autocommit is False , any pending transaction is implicitly rolled back. If autocommit is True or LEGACY_TRANSACTION_CONTROL , no implicit transaction control is executed. Make sure to commit() before closing to avoid losing pending changes.
Create a new Cursor object and call execute() on it with the given sql and parameters . Return the new cursor object.
Create a new Cursor object and call executemany() on it with the given sql and parameters . Return the new cursor object.
Create a new Cursor object and call executescript() on it with the given sql_script . Return the new cursor object.
Create or remove a user-defined SQL function.
name ( str ) – The name of the SQL function.
narg ( int ) – The number of arguments the SQL function can accept. If -1 , it may take any number of arguments.
func ( callback | None) – A callable that is called when the SQL function is invoked. The callable must return a type natively supported by SQLite . Set to None to remove an existing SQL function.
deterministic ( bool ) – If True , the created SQL function is marked as deterministic , which allows SQLite to perform additional optimizations.
Changed in version 3.8: Added the deterministic parameter.
Changed in version 3.13: Passing name , narg , and func as keyword arguments is deprecated. These parameters will become positional-only in Python 3.15.
Create or remove a user-defined SQL aggregate function.
name ( str ) – The name of the SQL aggregate function.
n_arg ( int ) – The number of arguments the SQL aggregate function can accept. If -1 , it may take any number of arguments.
A class must implement the following methods:
step() : Add a row to the aggregate.
finalize() : Return the final result of the aggregate as a type natively supported by SQLite .
The number of arguments that the step() method must accept is controlled by n_arg .
Set to None to remove an existing SQL aggregate function.
Changed in version 3.13: Passing name , n_arg , and aggregate_class as keyword arguments is deprecated. These parameters will become positional-only in Python 3.15.
Create or remove a user-defined aggregate window function.
name ( str ) – The name of the SQL aggregate window function to create or remove.
num_params ( int ) – The number of arguments the SQL aggregate window function can accept. If -1 , it may take any number of arguments.
A class that must implement the following methods:
step() : Add a row to the current window.
value() : Return the current value of the aggregate.
inverse() : Remove a row from the current window.
The number of arguments that the step() and value() methods must accept is controlled by num_params .
Set to None to remove an existing SQL aggregate window function.
NotSupportedError – If used with a version of SQLite older than 3.25.0, which does not support aggregate window functions.
Create a collation named name using the collating function callable . callable is passed two string arguments, and it should return an integer :
1 if the first is ordered higher than the second
-1 if the first is ordered lower than the second
0 if they are ordered equal
The following example shows a reverse sorting collation:
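A minimal sketch of such a reverse sorting collation, using a throwaway in-memory table:

```python
import sqlite3

def collate_reverse(string1, string2):
    if string1 == string2:
        return 0
    elif string1 < string2:
        return 1
    else:
        return -1

con = sqlite3.connect(":memory:")
con.create_collation("reverse", collate_reverse)
con.execute("CREATE TABLE test(x)")
con.executemany("INSERT INTO test(x) VALUES(?)", [("a",), ("b",)])
for row in con.execute("SELECT x FROM test ORDER BY x COLLATE reverse"):
    print(row)  # ('b',) then ('a',)
con.close()
```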
Remove a collation function by setting callable to None .
Changed in version 3.11: The collation name can contain any Unicode character. Earlier, only ASCII characters were allowed.
Call this method from a different thread to abort any queries that might be executing on the connection. Aborted queries will raise an OperationalError .
Register callable authorizer_callback to be invoked for each attempt to access a column of a table in the database. The callback should return one of SQLITE_OK , SQLITE_DENY , or SQLITE_IGNORE to signal how access to the column should be handled by the underlying SQLite library.
The first argument to the callback signifies what kind of operation is to be authorized. The second and third argument will be arguments or None depending on the first argument. The 4th argument is the name of the database ("main", "temp", etc.) if applicable. The 5th argument is the name of the inner-most trigger or view that is responsible for the access attempt or None if this access attempt is directly from input SQL code.
Please consult the SQLite documentation about the possible values for the first argument and the meaning of the second and third argument depending on the first one. All necessary constants are available in the sqlite3 module.
Passing None as authorizer_callback will disable the authorizer.
Changed in version 3.11: Added support for disabling the authorizer using None.
Changed in version 3.13: Passing authorizer_callback as a keyword argument is deprecated. The parameter will become positional-only in Python 3.15.
Register callable progress_handler to be invoked for every n instructions of the SQLite virtual machine. This is useful if you want to get called from SQLite during long-running operations, for example to update a GUI.
If you want to clear any previously installed progress handler, call the method with None for progress_handler .
Returning a non-zero value from the handler function will terminate the currently executing query and cause it to raise a DatabaseError exception.
Changed in version 3.13: Passing progress_handler as a keyword argument is deprecated. The parameter will become positional-only in Python 3.15.
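A minimal sketch of set_progress_handler(); the instruction interval of 100 and the long-running recursive query are chosen purely for illustration:

```python
import sqlite3

def progress():
    print("query still running...")
    return 0  # returning a non-zero value would abort the query

con = sqlite3.connect(":memory:")
con.set_progress_handler(progress, 100)  # invoke roughly every 100 VM instructions
con.execute("""
    WITH RECURSIVE cnt(x) AS (VALUES(1) UNION ALL SELECT x + 1 FROM cnt WHERE x < 100000)
    SELECT count(*) FROM cnt
""").fetchall()
con.set_progress_handler(None, 100)      # clear the handler again
con.close()
```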
Register callable trace_callback to be invoked for each SQL statement that is actually executed by the SQLite backend.
The only argument passed to the callback is the statement (as str ) that is being executed. The return value of the callback is ignored. Note that the backend does not only run statements passed to the Cursor.execute() methods. Other sources include the transaction management of the sqlite3 module and the execution of triggers defined in the current database.
Passing None as trace_callback will disable the trace callback.
Exceptions raised in the trace callback are not propagated. As a development and debugging aid, use enable_callback_tracebacks() to enable printing tracebacks from exceptions raised in the trace callback.
Added in version 3.3.
Changed in version 3.13: Passing trace_callback as a keyword argument is deprecated. The parameter will become positional-only in Python 3.15.
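A minimal sketch of set_trace_callback(), simply printing every statement the backend runs:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.set_trace_callback(print)       # print each executed SQL statement
con.execute("CREATE TABLE data(t)")
con.execute("INSERT INTO data(t) VALUES('trace me')")
con.set_trace_callback(None)        # disable tracing again
con.close()
```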
Enable the SQLite engine to load SQLite extensions from shared libraries if enabled is True ; else, disallow loading SQLite extensions. SQLite extensions can define new functions, aggregates or whole new virtual table implementations. One well-known extension is the fulltext-search extension distributed with SQLite.
The sqlite3 module is not built with loadable extension support by default, because some platforms (notably macOS) have SQLite libraries which are compiled without this feature. To get loadable extension support, you must pass the --enable-loadable-sqlite-extensions option to configure .
Raises an auditing event sqlite3.enable_load_extension with arguments connection , enabled .
Added in version 3.2.
Changed in version 3.10: Added the sqlite3.enable_load_extension auditing event.
Load an SQLite extension from a shared library. Enable extension loading with enable_load_extension() before calling this method.
path ( str ) – The path to the SQLite extension.
entrypoint ( str | None ) – Entry point name. If None (the default), SQLite will come up with an entry point name of its own; see the SQLite docs Loading an Extension for details.
Raises an auditing event sqlite3.load_extension with arguments connection , path .
Changed in version 3.10: Added the sqlite3.load_extension auditing event.
Changed in version 3.12: Added the entrypoint parameter.
Return an iterator to dump the database as SQL source code. Useful when saving an in-memory database for later restoration. Similar to the .dump command in the sqlite3 shell.
filter ( str | None ) – An optional LIKE pattern for database objects to dump, e.g. prefix_% . If None (the default), all database objects will be included.
Changed in version 3.13: Added the filter parameter.
Create a backup of an SQLite database.
Works even if the database is being accessed by other clients or concurrently by the same connection.
target ( Connection ) – The database connection to save the backup to.
pages ( int ) – The number of pages to copy at a time. If equal to or less than 0 , the entire database is copied in a single step. Defaults to -1 .
progress ( callback | None) – If set to a callable , it is invoked with three integer arguments for every backup iteration: the status of the last iteration, the remaining number of pages still to be copied, and the total number of pages. Defaults to None .
name ( str ) – The name of the database to back up. Either "main" (the default) for the main database, "temp" for the temporary database, or the name of a custom database as attached using the ATTACH DATABASE SQL statement.
sleep ( float ) – The number of seconds to sleep between successive attempts to back up remaining pages.
Example 1, copy an existing database into another:
Example 2, copy an existing database into a transient copy:
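A minimal sketch covering both backup examples; the filenames example.db and backup.db are placeholders:

```python
import sqlite3

def progress(status, remaining, total):
    print(f"Copied {total - remaining} of {total} pages...")

# Example 1: copy an existing database into another file on disk.
src = sqlite3.connect("example.db")
dst = sqlite3.connect("backup.db")
with dst:
    src.backup(dst, pages=1, progress=progress)
dst.close()

# Example 2: copy an existing database into a transient in-memory copy.
mem = sqlite3.connect(":memory:")
src.backup(mem)
mem.close()
src.close()
```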
Added in version 3.7.
Get a connection runtime limit.
category ( int ) – The SQLite limit category to be queried.
ProgrammingError – If category is not recognised by the underlying SQLite library.
Example, query the maximum length of an SQL statement for Connection con (the default is 1000000000):
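A minimal standalone sketch of that query (the example connection is created in-memory here):

```python
import sqlite3

con = sqlite3.connect(":memory:")
print(con.getlimit(sqlite3.SQLITE_LIMIT_SQL_LENGTH))  # typically 1000000000
con.close()
```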
Set a connection runtime limit. Attempts to increase a limit above its hard upper bound are silently truncated to the hard upper bound. Regardless of whether or not the limit was changed, the prior value of the limit is returned.
category ( int ) – The SQLite limit category to be set.
limit ( int ) – The value of the new limit. If negative, the current limit is unchanged.
Example, limit the number of attached databases to 1 for Connection con (the default limit is 10):
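A minimal standalone sketch of that limit change; the prior value of the limit is returned:

```python
import sqlite3

con = sqlite3.connect(":memory:")
prior = con.setlimit(sqlite3.SQLITE_LIMIT_ATTACHED, 1)
print(prior)  # typically 10
con.close()
```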
Query a boolean connection configuration option.
op ( int ) – A SQLITE_DBCONFIG code .
Set a boolean connection configuration option.
enable ( bool ) – True if the configuration option should be enabled (default); False if it should be disabled.
Serialize a database into a bytes object. For an ordinary on-disk database file, the serialization is just a copy of the disk file. For an in-memory database or a "temp" database, the serialization is the same sequence of bytes which would be written to disk if that database were backed up to disk.
name ( str ) – The database name to be serialized. Defaults to "main" .
This method is only available if the underlying SQLite library has the serialize API.
Deserialize a serialized database into a Connection . This method causes the database connection to disconnect from database name , and reopen name as an in-memory database based on the serialization contained in data .
data ( bytes ) – A serialized database.
name ( str ) – The database name to deserialize into. Defaults to "main" .
OperationalError – If the database connection is currently involved in a read transaction or a backup operation.
DatabaseError – If data does not contain a valid SQLite database.
OverflowError – If len(data) is larger than 2**63 - 1 .
This method is only available if the underlying SQLite library has the deserialize API.
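A minimal round-trip sketch of serialize() and deserialize(), assuming an SQLite library that provides both APIs:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE data(t)")
con.execute("INSERT INTO data(t) VALUES('hello')")
raw = con.serialize()            # bytes snapshot of the whole database
con.close()

other = sqlite3.connect(":memory:")
other.deserialize(raw)           # reopen "main" from the serialized bytes
print(other.execute("SELECT t FROM data").fetchone())  # ('hello',)
other.close()
```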
This attribute controls PEP 249 -compliant transaction behaviour. autocommit has three allowed values:
False : Select PEP 249 -compliant transaction behaviour, implying that sqlite3 ensures a transaction is always open. Use commit() and rollback() to close transactions.
This is the recommended value of autocommit .
True : Use SQLite’s autocommit mode . commit() and rollback() have no effect in this mode.
LEGACY_TRANSACTION_CONTROL : Pre-Python 3.12 (non- PEP 249 -compliant) transaction control. See isolation_level for more details.
This is currently the default value of autocommit .
Changing autocommit to False will open a new transaction, and changing it to True will commit any pending transaction.
See Transaction control via the autocommit attribute for more details.
The isolation_level attribute has no effect unless autocommit is LEGACY_TRANSACTION_CONTROL .
This read-only attribute corresponds to the low-level SQLite autocommit mode .
True if a transaction is active (there are uncommitted changes), False otherwise.
Controls the legacy transaction handling mode of sqlite3 . If set to None , transactions are never implicitly opened. If set to one of "DEFERRED" , "IMMEDIATE" , or "EXCLUSIVE" , corresponding to the underlying SQLite transaction behaviour , implicit transaction management is performed.
If not overridden by the isolation_level parameter of connect() , the default is "" , which is an alias for "DEFERRED" .
Using autocommit to control transaction handling is recommended over using isolation_level . isolation_level has no effect unless autocommit is set to LEGACY_TRANSACTION_CONTROL (the default).
The initial row_factory for Cursor objects created from this connection. Assigning to this attribute does not affect the row_factory of existing cursors belonging to this connection, only new ones. Is None by default, meaning each row is returned as a tuple .
See How to create and use row factories for more details.
A callable that accepts a bytes parameter and returns a text representation of it. The callable is invoked for SQLite values with the TEXT data type. By default, this attribute is set to str .
See How to handle non-UTF-8 text encodings for more details.
Return the total number of database rows that have been modified, inserted, or deleted since the database connection was opened.
A Cursor object represents a database cursor which is used to execute SQL statements, and manage the context of a fetch operation. Cursors are created using Connection.cursor() , or by using any of the connection shortcut methods . Cursor objects are iterators , meaning that if you execute() a SELECT query, you can simply iterate over the cursor to fetch the resulting rows: for row in cur.execute("SELECT t FROM data"): print(row)
A Cursor instance has the following attributes and methods.
Execute a single SQL statement, optionally binding Python values using placeholders .
sql ( str ) – A single SQL statement.
parameters ( dict | sequence ) – Python values to bind to placeholders in sql . A dict if named placeholders are used. A sequence if unnamed placeholders are used. See How to use placeholders to bind values in SQL queries .
ProgrammingError – If sql contains more than one SQL statement.
If autocommit is LEGACY_TRANSACTION_CONTROL , isolation_level is not None , sql is an INSERT , UPDATE , DELETE , or REPLACE statement, and there is no open transaction, a transaction is implicitly opened before executing sql .
Deprecated since version 3.12, will be removed in version 3.14: DeprecationWarning is emitted if named placeholders are used and parameters is a sequence instead of a dict . Starting with Python 3.14, ProgrammingError will be raised instead.
Use executescript() to execute multiple SQL statements.
For every item in parameters , repeatedly execute the parameterized DML SQL statement sql .
Uses the same implicit transaction handling as execute() .
sql ( str ) – A single SQL DML statement.
parameters ( iterable ) – An iterable of parameters to bind with the placeholders in sql . See How to use placeholders to bind values in SQL queries .
ProgrammingError – If sql contains more than one SQL statement, or is not a DML statement.
Any resulting rows are discarded, including DML statements with RETURNING clauses .
Deprecated since version 3.12, will be removed in version 3.14: DeprecationWarning is emitted if named placeholders are used and the items in parameters are sequences instead of dict s. Starting with Python 3.14, ProgrammingError will be raised instead.
Execute the SQL statements in sql_script . If autocommit is LEGACY_TRANSACTION_CONTROL and there is a pending transaction, an implicit COMMIT statement is executed first. No other implicit transaction control is performed; any transaction control must be added to sql_script .
sql_script must be a string .
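A minimal sketch of executescript(), running a multi-statement script that carries its own transaction control; the schema is illustrative:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    BEGIN;
    CREATE TABLE person(firstname, lastname, age);
    CREATE TABLE book(title, author, published);
    INSERT INTO book(title, author, published)
        VALUES ('Dirk Gently''s Holistic Detective Agency', 'Douglas Adams', 1987);
    COMMIT;
""")
con.close()
```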
If row_factory is None , return the next row of the query result set as a tuple . Else, pass it to the row factory and return its result. Return None if no more data is available.
Return the next set of rows of a query result as a list . Return an empty list if no more rows are available.
The number of rows to fetch per call is specified by the size parameter. If size is not given, arraysize determines the number of rows to be fetched. If fewer than size rows are available, as many rows as are available are returned.
Note there are performance considerations involved with the size parameter. For optimal performance, it is usually best to use the arraysize attribute. If the size parameter is used, then it is best for it to retain the same value from one fetchmany() call to the next.
Return all (remaining) rows of a query result as a list . Return an empty list if no rows are available. Note that the arraysize attribute can affect the performance of this operation.
Close the cursor now (rather than whenever __del__ is called).
The cursor will be unusable from this point forward; a ProgrammingError exception will be raised if any operation is attempted with the cursor.
Required by the DB-API. Does nothing in sqlite3 .
Read/write attribute that controls the number of rows returned by fetchmany() . The default value is 1 which means a single row would be fetched per call.
Read-only attribute that provides the SQLite database Connection belonging to the cursor. A Cursor object created by calling con.cursor() will have a connection attribute that refers to con :
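A minimal sketch confirming that the cursor's connection attribute refers back to the connection that created it:

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
print(cur.connection is con)  # True
con.close()
```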
Read-only attribute that provides the column names of the last query. To remain compatible with the Python DB API, it returns a 7-tuple for each column where the last six items of each tuple are None .
It is set for SELECT statements without any matching rows as well.
Read-only attribute that provides the row id of the last inserted row. It is only updated after successful INSERT or REPLACE statements using the execute() method. For other statements, after executemany() or executescript() , or if the insertion failed, the value of lastrowid is left unchanged. The initial value of lastrowid is None .
Inserts into WITHOUT ROWID tables are not recorded.
Changed in version 3.6: Added support for the REPLACE statement.
Read-only attribute that provides the number of modified rows for INSERT , UPDATE , DELETE , and REPLACE statements; is -1 for other statements, including CTE queries. It is only updated by the execute() and executemany() methods, after the statement has run to completion. This means that any resulting rows must be fetched in order for rowcount to be updated.
Control how a row fetched from this Cursor is represented. If None , a row is represented as a tuple . Can be set to the included sqlite3.Row ; or a callable that accepts two arguments, a Cursor object and the tuple of row values, and returns a custom object representing an SQLite row.
Defaults to what Connection.row_factory was set to when the Cursor was created. Assigning to this attribute does not affect Connection.row_factory of the parent connection.
A Row instance serves as a highly optimized row_factory for Connection objects. It supports iteration, equality testing, len() , and mapping access by column name and index.
Two Row objects compare equal if they have identical column names and values.
Return a list of column names as strings . Immediately after a query, it is the first member of each tuple in Cursor.description .
Changed in version 3.5: Added support for slicing.
A Blob instance is a file-like object that can read and write data in an SQLite BLOB . Call len(blob) to get the size (number of bytes) of the blob. Use indices and slices for direct access to the blob data.
Use the Blob as a context manager to ensure that the blob handle is closed after use.
Close the blob.
The blob will be unusable from this point onward. An Error (or subclass) exception will be raised if any further operation is attempted with the blob.
Read length bytes of data from the blob at the current offset position. If the end of the blob is reached, the data up to EOF will be returned. When length is not specified, or is negative, read() will read until the end of the blob.
Write data to the blob at the current offset. This function cannot change the blob length. Writing beyond the end of the blob will raise ValueError .
Return the current access position of the blob.
Set the current access position of the blob to offset . The origin argument defaults to os.SEEK_SET (absolute blob positioning). Other values for origin are os.SEEK_CUR (seek relative to the current position) and os.SEEK_END (seek relative to the blob’s end).
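A minimal sketch of working with a Blob via blobopen(); the table and column names are illustrative:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE test(blob_col blob)")
con.execute("INSERT INTO test(blob_col) VALUES(zeroblob(13))")

# Open a blob handle for row 1 of test.blob_col and use it as a context manager.
with con.blobopen("test", "blob_col", 1) as blob:
    blob.write(b"hello, ")
    blob.write(b"world.")
    blob.seek(0)
    print(blob.read())   # b'hello, world.'
    print(blob[5:12])    # slice access: b', world'
con.close()
```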
The PrepareProtocol type’s single purpose is to act as a PEP 246 style adaption protocol for objects that can adapt themselves to native SQLite types .
The exception hierarchy is defined by the DB-API 2.0 ( PEP 249 ).
This exception is not currently raised by the sqlite3 module, but may be raised by applications using sqlite3 , for example if a user-defined function truncates data while inserting. Warning is a subclass of Exception .
The base class of the other exceptions in this module. Use this to catch all errors with one single except statement. Error is a subclass of Exception .
If the exception originated from within the SQLite library, the following two attributes are added to the exception:
The numeric error code from the SQLite API
The symbolic name of the numeric error code from the SQLite API
Exception raised for misuse of the low-level SQLite C API. In other words, if this exception is raised, it probably indicates a bug in the sqlite3 module. InterfaceError is a subclass of Error .
Exception raised for errors that are related to the database. This serves as the base exception for several types of database errors. It is only raised implicitly through the specialised subclasses. DatabaseError is a subclass of Error .
Exception raised for errors caused by problems with the processed data, like numeric values out of range, and strings which are too long. DataError is a subclass of DatabaseError .
Exception raised for errors that are related to the database’s operation, and not necessarily under the control of the programmer. For example, the database path is not found, or a transaction could not be processed. OperationalError is a subclass of DatabaseError .
Exception raised when the relational integrity of the database is affected, e.g. a foreign key check fails. It is a subclass of DatabaseError .
Exception raised when SQLite encounters an internal error. If this is raised, it may indicate that there is a problem with the runtime SQLite library. InternalError is a subclass of DatabaseError .
Exception raised for sqlite3 API programming errors, for example supplying the wrong number of bindings to a query, or trying to operate on a closed Connection . ProgrammingError is a subclass of DatabaseError .
Exception raised in case a method or database API is not supported by the underlying SQLite library. For example, setting deterministic to True in create_function() , if the underlying SQLite library does not support deterministic functions. NotSupportedError is a subclass of DatabaseError .
SQLite natively supports the following types: NULL , INTEGER , REAL , TEXT , BLOB .
The following Python types can thus be sent to SQLite without any problem:
Python type | SQLite type
---|---
None | NULL
int | INTEGER
float | REAL
str | TEXT
bytes | BLOB
This is how SQLite types are converted to Python types by default:
SQLite type | Python type
---|---
NULL | None
INTEGER | int
REAL | float
TEXT | depends on text_factory , str by default
BLOB | bytes
The type system of the sqlite3 module is extensible in two ways: you can store additional Python types in an SQLite database via object adapters , and you can let the sqlite3 module convert SQLite types to Python types via converters .
The default adapters and converters are deprecated as of Python 3.12. Instead, use the Adapter and converter recipes and tailor them to your needs.
The deprecated default adapters and converters consist of:
An adapter for datetime.date objects to strings in ISO 8601 format.
An adapter for datetime.datetime objects to strings in ISO 8601 format.
A converter for declared "date" types to datetime.date objects.
A converter for declared "timestamp" types to datetime.datetime objects. Fractional parts will be truncated to 6 digits (microsecond precision).
The default "timestamp" converter ignores UTC offsets in the database and always returns a naive datetime.datetime object. To preserve UTC offsets in timestamps, either leave converters disabled, or register an offset-aware converter with register_converter() .
Deprecated since version 3.12.
The sqlite3 module can be invoked as a script, using the interpreter’s -m switch, in order to provide a simple SQLite shell. The argument signature is as follows:
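The signature is roughly as follows (sketched here; the optional filename and sql arguments name the database file to open and an SQL statement to execute):

```
python -m sqlite3 [-h] [-v] [filename] [sql]
```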
Type .quit or CTRL-D to exit the shell.
Print CLI help.
Print underlying SQLite library version.
How to use placeholders to bind values in SQL queries
SQL operations usually need to use values from Python variables. However, beware of using Python’s string operations to assemble queries, as they are vulnerable to SQL injection attacks . For example, an attacker can simply close the single quote and inject OR TRUE to select all rows:
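A minimal sketch of the attack, using a throwaway users table; the malicious input closes the quote and injects OR TRUE, so the query matches every row:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE users(name, is_admin)")
con.executemany("INSERT INTO users VALUES(?, ?)", [("alice", 0), ("bob", 1)])

# Never do this -- insecure string formatting!
user_input = "' OR TRUE OR name = '"
sql = "SELECT * FROM users WHERE name = '%s'" % user_input
print(con.execute(sql).fetchall())  # every row is returned, not just one user
con.close()
```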
Instead, use the DB-API’s parameter substitution. To insert a variable into a query string, use a placeholder in the string, and substitute the actual values into the query by providing them as a tuple of values to the second argument of the cursor’s execute() method.
An SQL statement may use one of two kinds of placeholders: question marks (qmark style) or named placeholders (named style). For the qmark style, parameters must be a sequence whose length must match the number of placeholders, or a ProgrammingError is raised. For the named style, parameters must be an instance of a dict (or a subclass), which must contain keys for all named parameters; any extra items are ignored. Here’s an example of both styles:
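A minimal sketch of both styles, using a throwaway lang table:

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.execute("CREATE TABLE lang(name, first_appeared)")

# qmark style: parameters is a sequence matching the number of placeholders.
cur.execute("INSERT INTO lang VALUES(?, ?)", ("C", 1972))

# named style: parameters is a dict (or a dict subclass).
data = (
    {"name": "C++", "year": 1985},
    {"name": "Python", "year": 1991},
)
cur.executemany("INSERT INTO lang VALUES(:name, :year)", data)

print(cur.execute("SELECT * FROM lang ORDER BY first_appeared").fetchall())
con.close()
```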
PEP 249 numeric placeholders are not supported. If used, they will be interpreted as named placeholders.
SQLite supports only a limited set of data types natively. To store custom Python types in SQLite databases, adapt them to one of the Python types SQLite natively understands .
There are two ways to adapt Python objects to SQLite types: letting your object adapt itself, or using an adapter callable . The latter takes precedence over the former. For a library that exports a custom type, it may make sense to enable that type to adapt itself. As an application developer, it may make more sense to take direct control by registering custom adapter functions.
Suppose we have a Point class that represents a pair of coordinates, x and y , in a Cartesian coordinate system. The coordinate pair will be stored as a text string in the database, using a semicolon to separate the coordinates. This can be implemented by adding a __conform__(self, protocol) method which returns the adapted value. The object passed to protocol will be of type PrepareProtocol .
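A minimal sketch of the self-adapting approach, with a bare-bones Point class:

```python
import sqlite3

class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __conform__(self, protocol):
        if protocol is sqlite3.PrepareProtocol:
            return f"{self.x};{self.y}"

con = sqlite3.connect(":memory:")
cur = con.execute("SELECT ?", (Point(4.0, -3.2),))
print(cur.fetchone()[0])  # '4.0;-3.2'
con.close()
```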
The other possibility is to create a function that converts the Python object to an SQLite-compatible type. This function can then be registered using register_adapter() .
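A minimal sketch of the adapter-callable approach for the same kind of Point class:

```python
import sqlite3

class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

def adapt_point(point):
    return f"{point.x};{point.y}"

sqlite3.register_adapter(Point, adapt_point)

con = sqlite3.connect(":memory:")
cur = con.execute("SELECT ?", (Point(1.0, 2.5),))
print(cur.fetchone()[0])  # '1.0;2.5'
con.close()
```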
Writing an adapter lets you convert from custom Python types to SQLite values. To be able to convert from SQLite values to custom Python types, we use converters .
Let’s go back to the Point class. We stored the x and y coordinates separated via semicolons as strings in SQLite.
First, we’ll define a converter function that accepts the string as a parameter and constructs a Point object from it.
Converter functions are always passed a bytes object, no matter the underlying SQLite data type.
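A minimal sketch of such a converter; since converters always receive bytes, the value is split on the semicolon and decoded into numbers:

```python
import sqlite3

class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __repr__(self):
        return f"Point({self.x};{self.y})"

def convert_point(s):
    # s is always a bytes object, regardless of the column's SQLite type.
    x, y = map(float, s.split(b";"))
    return Point(x, y)

sqlite3.register_converter("point", convert_point)
```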
We now need to tell sqlite3 when it should convert a given SQLite value. This is done when connecting to a database, using the detect_types parameter of connect() . There are three options:
Implicit: set detect_types to PARSE_DECLTYPES
Explicit: set detect_types to PARSE_COLNAMES
Both: set detect_types to sqlite3.PARSE_DECLTYPES | sqlite3.PARSE_COLNAMES . Column names take precedence over declared types.
The following example illustrates the implicit and explicit approaches:
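A minimal self-contained sketch of the implicit (declared type) and explicit (column name) approaches, reusing the Point adapter and converter from the sketches above:

```python
import sqlite3

class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y
    def __repr__(self):
        return f"Point({self.x};{self.y})"

def adapt_point(point):
    return f"{point.x};{point.y}"

def convert_point(s):
    x, y = map(float, s.split(b";"))
    return Point(x, y)

sqlite3.register_adapter(Point, adapt_point)
sqlite3.register_converter("point", convert_point)

# Implicit: the declared column type "point" selects the converter.
con = sqlite3.connect(":memory:", detect_types=sqlite3.PARSE_DECLTYPES)
con.execute("CREATE TABLE test(p point)")
con.execute("INSERT INTO test(p) VALUES(?)", (Point(4.0, -3.2),))
print(con.execute("SELECT p FROM test").fetchone()[0])
con.close()

# Explicit: the converter is selected via the column name 'p [point]'.
con = sqlite3.connect(":memory:", detect_types=sqlite3.PARSE_COLNAMES)
print(con.execute("SELECT ? AS 'p [point]'", (Point(1.0, 2.5),)).fetchone()[0])
con.close()
```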
This section shows recipes for common adapters and converters.
Using the execute() , executemany() , and executescript() methods of the Connection class, your code can be written more concisely because you don’t have to create the (often superfluous) Cursor objects explicitly. Instead, the Cursor objects are created implicitly and these shortcut methods return the cursor objects. This way, you can execute a SELECT statement and iterate over it directly using only a single call on the Connection object.
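A minimal sketch of the shortcut methods; the cursor is created and returned for you, and can be iterated over directly:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE lang(name, first_appeared)")
data = [("C++", 1985), ("Objective-C", 1984)]
con.executemany("INSERT INTO lang(name, first_appeared) VALUES(?, ?)", data)

for row in con.execute("SELECT name, first_appeared FROM lang"):
    print(row)
con.close()
```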
A Connection object can be used as a context manager that automatically commits or rolls back open transactions when leaving the body of the context manager. If the body of the with statement finishes without exceptions, the transaction is committed. If this commit fails, or if the body of the with statement raises an uncaught exception, the transaction is rolled back. If autocommit is False , a new transaction is implicitly opened after committing or rolling back.
If there is no open transaction upon leaving the body of the with statement, or if autocommit is True , the context manager does nothing.
The context manager neither implicitly opens a new transaction nor closes the connection. If you need a closing context manager, consider using contextlib.closing() .
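A minimal sketch of a connection used as a context manager; a successful body commits, a raising body rolls back, and the connection itself stays open afterwards:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE lang(name)")

with con:
    con.execute("INSERT INTO lang(name) VALUES(?)", ("Python",))  # committed

try:
    with con:
        con.execute("INSERT INTO lang(name) VALUES(?)", ("C",))
        raise ValueError("boom")      # forces a rollback of this insert
except ValueError:
    pass

print(con.execute("SELECT count(*) FROM lang").fetchone())  # (1,)
con.close()
```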
Some useful URI tricks include the following; all three are illustrated in the sketch after this list:
Open a database in read-only mode:
Do not implicitly create a new database file if it does not already exist; will raise OperationalError if unable to create a new file:
Create a shared named in-memory database:
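A minimal combined sketch of the three tricks above; tutorial.db is a placeholder for an existing database file:

```python
import sqlite3

# Open a database in read-only mode.
con = sqlite3.connect("file:tutorial.db?mode=ro", uri=True)
con.close()

# Refuse to create a missing database file (raises OperationalError if absent).
# con = sqlite3.connect("file:nosuchdb.db?mode=rw", uri=True)

# Create a shared named in-memory database.
db = "file:mem1?mode=memory&cache=shared"
con1 = sqlite3.connect(db, uri=True)
con2 = sqlite3.connect(db, uri=True)
con1.execute("CREATE TABLE shared(data)")
con1.execute("INSERT INTO shared VALUES(28)")
con1.commit()
print(con2.execute("SELECT data FROM shared").fetchone())  # (28,)
```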
More information about this feature, including a list of parameters, can be found in the SQLite URI documentation .
By default, sqlite3 represents each row as a tuple . If a tuple does not suit your needs, you can use the sqlite3.Row class or a custom row_factory .
While row_factory exists as an attribute both on the Cursor and the Connection , it is recommended to set Connection.row_factory , so all cursors created from the connection will use the same row factory.
Row provides indexed and case-insensitive named access to columns, with minimal memory overhead and performance impact over a tuple . To use Row as a row factory, assign it to the row_factory attribute:
Queries now return Row objects:
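A minimal sketch of assigning Row as the row factory and querying through it:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.row_factory = sqlite3.Row

row = con.execute("SELECT 'Earth' AS name, 6378 AS radius").fetchone()
print(row.keys())                   # ['name', 'radius']
print(row["name"], row["RADIUS"])   # named access is case-insensitive
print(row[0], row[1])               # indexed access still works
con.close()
```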
The FROM clause can be omitted in the SELECT statement, as in the above example. In such cases, SQLite returns a single row with columns defined by expressions, e.g. literals, with the given aliases expr AS alias .
You can create a custom row_factory that returns each row as a dict , with column names mapped to values:
Using it, queries now return a dict instead of a tuple :
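A minimal sketch of such a dict-based row factory:

```python
import sqlite3

def dict_factory(cursor, row):
    fields = [column[0] for column in cursor.description]
    return dict(zip(fields, row))

con = sqlite3.connect(":memory:")
con.row_factory = dict_factory
for row in con.execute("SELECT 1 AS a, 2 AS b"):
    print(row)   # {'a': 1, 'b': 2}
con.close()
```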
The following row factory returns a named tuple :
namedtuple_factory() can be used as follows:
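A minimal sketch of a named-tuple row factory and its use:

```python
from collections import namedtuple
import sqlite3

def namedtuple_factory(cursor, row):
    fields = [column[0] for column in cursor.description]
    cls = namedtuple("Row", fields)
    return cls._make(row)

con = sqlite3.connect(":memory:")
con.row_factory = namedtuple_factory
row = con.execute("SELECT 1 AS a, 2 AS b").fetchone()
print(row)            # Row(a=1, b=2)
print(row.a, row[1])  # attribute and index access both work
con.close()
```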
With some adjustments, the above recipe can be adapted to use a dataclass , or any other custom class, instead of a namedtuple .
By default, sqlite3 uses str to adapt SQLite values with the TEXT data type. This works well for UTF-8 encoded text, but it might fail for other encodings and invalid UTF-8. You can use a custom text_factory to handle such cases.
Because of SQLite’s flexible typing , it is not uncommon to encounter table columns with the TEXT data type containing non-UTF-8 encodings, or even arbitrary data. To demonstrate, let’s assume we have a database with ISO-8859-2 (Latin-2) encoded text, for example a table of Czech-English dictionary entries. Assuming we now have a Connection instance con connected to this database, we can decode the Latin-2 encoded text using this text_factory :
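A minimal sketch, assuming con is already connected to the Latin-2 encoded database described above:

```python
# Decode TEXT values as ISO-8859-2 (Latin-2) instead of UTF-8.
con.text_factory = lambda data: str(data, encoding="latin2")
```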
For invalid UTF-8 or arbitrary data stored in TEXT table columns, you can use the following technique, borrowed from the Unicode HOWTO :
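A minimal sketch of that technique, again assuming an existing connection con:

```python
# Undecodable bytes are preserved as surrogate code points instead of raising.
con.text_factory = lambda data: str(data, errors="surrogateescape")
```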
The sqlite3 module API does not support strings containing surrogates.
Unicode HOWTO
Transaction control
sqlite3 offers multiple methods of controlling whether, when and how database transactions are opened and closed. Transaction control via the autocommit attribute is recommended, while Transaction control via the isolation_level attribute retains the pre-Python 3.12 behaviour.
The recommended way of controlling transaction behaviour is through the Connection.autocommit attribute, which should preferably be set using the autocommit parameter of connect() .
It is suggested to set autocommit to False , which implies PEP 249 -compliant transaction control. This means:
sqlite3 ensures that a transaction is always open, so connect() , Connection.commit() , and Connection.rollback() will implicitly open a new transaction (immediately after closing the pending one, for the latter two). sqlite3 uses BEGIN DEFERRED statements when opening transactions.
Transactions should be committed explicitly using commit() .
Transactions should be rolled back explicitly using rollback() .
An implicit rollback is performed if the database is close() -ed with pending changes.
Set autocommit to True to enable SQLite’s autocommit mode . In this mode, Connection.commit() and Connection.rollback() have no effect. Note that SQLite’s autocommit mode is distinct from the PEP 249 -compliant Connection.autocommit attribute; use Connection.in_transaction to query the low-level SQLite autocommit mode.
Set autocommit to LEGACY_TRANSACTION_CONTROL to leave transaction control behaviour to the Connection.isolation_level attribute. See Transaction control via the isolation_level attribute for more information.
The recommended way of controlling transactions is via the autocommit attribute. See Transaction control via the autocommit attribute .
If Connection.autocommit is set to LEGACY_TRANSACTION_CONTROL (the default), transaction behaviour is controlled using the Connection.isolation_level attribute. Otherwise, isolation_level has no effect.
If the connection attribute isolation_level is not None , new transactions are implicitly opened before execute() and executemany() executes INSERT , UPDATE , DELETE , or REPLACE statements; for other statements, no implicit transaction handling is performed. Use the commit() and rollback() methods to respectively commit and roll back pending transactions. You can choose the underlying SQLite transaction behaviour — that is, whether and what type of BEGIN statements sqlite3 implicitly executes – via the isolation_level attribute.
If isolation_level is set to None , no transactions are implicitly opened at all. This leaves the underlying SQLite library in autocommit mode , but also allows the user to perform their own transaction handling using explicit SQL statements. The underlying SQLite library autocommit mode can be queried using the in_transaction attribute.
The executescript() method implicitly commits any pending transaction before execution of the given SQL script, regardless of the value of isolation_level .
Changed in version 3.6: sqlite3 used to implicitly commit an open transaction before DDL statements. This is no longer the case.
Changed in version 3.12: The recommended way of controlling transactions is now via the autocommit attribute.
Sunone Aimbot
Sunone Aimbot is an AI-powered aim bot for first-person shooter games. It leverages the YOLOv8 and YOLOv10 models, PyTorch, and various other tools to automatically target and aim at enemies within the game. The AI model in the repository has been trained on more than 30,000 images from popular first-person shooter games like Warface, Destiny 2, Battlefield 2042, CS:GO, Fortnite, The Finals, CS2 and more.
Use it at your own risk; we cannot guarantee that you will not be blocked!
This application only works on Nvidia graphics cards. AMD support is in testing; see the AI_enable_AMD option. For productive and stable operation, an RTX 20-series card or newer is recommended.
Sunone Aimbot has been tested in the following environment:
Component | Version
---|---
Windows | 10 and 11 (priority)
Python | 3.11.6
CUDA | 12.4
TensorRT | 10.0.1
Ultralytics | 8.2.48
GitHub AI Model | 0.4.1 (YOLOv8)
Boosty AI Model | 0.5.7 (YOLOv10)
The behavior of the aimbot can be configured via the config.ini file. Here are the available options:
I will post new models here .
This project is licensed under the MIT License. See LICENSE for details
Volumes are Unity Catalog objects that enable governance over non-tabular datasets. Volumes represent a logical volume of storage in a cloud object storage location. Volumes provide capabilities for accessing, storing, governing, and organizing files.
While tables provide governance over tabular datasets, volumes add governance over non-tabular datasets. You can use volumes to store and access files in any format, including structured, semi-structured, and unstructured data.
Databricks recommends using volumes to govern access to all non-tabular data. Like tables, volumes can be managed or external.
You cannot use volumes as a location for tables. Volumes are intended for path-based data access only. Use tables when you want to work with tabular data in Unity Catalog.
The following articles provide more information about working with volumes:
Create and manage volumes .
Manage files in volumes .
Explore storage and find data files .
Managed vs. external volumes .
What are the privileges for volumes? .
When you work with volumes, you must use a SQL warehouse or a cluster running Databricks Runtime 13.3 LTS or above, unless you are using Databricks UIs such as Catalog Explorer.
A managed volume is a Unity Catalog-governed storage volume created within the managed storage location of the containing schema. See Specify a managed storage location in Unity Catalog .
Managed volumes allow the creation of governed storage for working with files without the overhead of external locations and storage credentials. You do not need to specify a location when creating a managed volume, and all file access for data in managed volumes is through paths managed by Unity Catalog.
An external volume is a Unity Catalog-governed storage volume registered against a directory within an external location using Unity Catalog-governed storage credentials.
Unity Catalog does not manage the lifecycle and layout of the files in external volumes. When you drop an external volume, Unity Catalog does not delete the underlying data.
Volumes sit at the third level of the Unity Catalog three-level namespace ( catalog.schema.volume ):
The path to access volumes is the same whether you use Apache Spark, SQL, Python, or other languages and libraries. This differs from legacy access patterns for files in object storage bound to a Databricks workspace.
The path to access files in volumes uses the following format:
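Sketching the format, under the assumption that volume paths always start with the reserved /Volumes prefix mentioned below:

```
/Volumes/<catalog>/<schema>/<volume>/<path-to-file>
```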
Databricks also supports an optional dbfs:/ scheme when working with Apache Spark, so the following path also works:
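That is, assuming the same placeholders as above:

```
dbfs:/Volumes/<catalog>/<schema>/<volume>/<path-to-file>
```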
The sequence /<catalog>/<schema>/<volume> in the path corresponds to the three Unity Catalog object names associated with the file. These path elements are read-only and not directly writeable by users, meaning it is not possible to create or delete these directories using filesystem operations. They are automatically managed and kept in sync with the corresponding Unity Catalog entities.
You can also access data in external volumes using cloud storage URIs.
The following notebook demonstrates the basic SQL syntax to create and interact with Unity Catalog volumes.
Reserved paths for volumes.
Volumes introduce the following reserved paths used for accessing volumes:
/Volumes
dbfs:/Volumes
Paths are also reserved for potential typos for these paths from Apache Spark APIs and dbutils , including /volumes , /Volume , /volume , whether or not they are preceded by dbfs:/ . The path /dbfs/Volumes is also reserved, but cannot be used to access volumes.
Volumes are only supported on Databricks Runtime 13.3 LTS and above. In Databricks Runtime 12.2 LTS and below, operations against /Volumes paths might succeed, but can write data to ephemeral storage disks attached to compute clusters rather than persisting data to Unity Catalog volumes as expected.
If you have pre-existing data stored in a reserved path on the DBFS root, you can file a support ticket to gain temporary access to this data to move it to another location.
You must use Unity Catalog-enabled compute to interact with Unity Catalog volumes. Volumes do not support all workloads.
Volumes do not support dbutils.fs commands distributed to executors.
The following limitations apply:
In Databricks Runtime 14.3 LTS and above:
On single-user clusters, you cannot access volumes from threads and subprocesses in Scala.
In Databricks Runtime 14.2 and below:
On compute configured with shared access mode, you can’t use UDFs to access volumes.
Both Python and Scala have access to FUSE from the driver, but not from the executors.
Scala code that performs I/O operations can run on the driver but not the executors.
On compute configured with single user access mode, there is no support for FUSE in Scala, Scala IO code accessing data using volume paths, or Scala UDFs. Python UDFs are supported in single user access mode.
On all supported Databricks Runtime versions:
Unity Catalog UDFs do not support accessing volume file paths.
You cannot access volumes from RDDs.
You cannot use spark-submit with JARs stored in a volume.
You cannot define dependencies to other libraries accessed via volume paths inside a wheel or JAR file.
You cannot list Unity Catalog objects using the /Volumes/<catalog-name> or /Volumes/<catalog-name>/<schema-name> patterns. You must use a fully-qualified path that includes a volume name.
The DBFS endpoint for the REST API does not support volumes paths.
Volumes are excluded from global search results in the Databricks workspace.
You cannot specify volumes as the destination for cluster log delivery.
%sh mv is not supported for moving files between volumes. Use dbutils.fs.mv or %sh cp instead.
You cannot create a custom Hadoop file system with volumes, meaning the following is not supported:
I have already run Anaconda as an administrator, but it still shows "PermissionError: [Errno 13] Permission denied". I thought it was because the folder attribute is read-only, so I tried several ways to change this, such as CMD checks and PowerShell checks, but the folder still shows read-only. Windows support said that this was not the issue and that the read-only display is something that can't be changed. So what am I supposed to do next? Thanks.
What code are you executing to get this error?
I’m new here and don’t know much about the format. If there is anything inappropriate, please let me know. Thank you.
What is the full traceback? What are the relevant variables at that point in time?
```
Traceback (most recent call last):
  File "D:\Project\guben\yunxing2.py", line 129, in <module>
    main()
  File "D:\Project\guben\yunxing2.py", line 118, in main
    recorded_text = recognize_speech(file_path)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Project\guben\yunxing2.py", line 75, in recognize_speech
    wf = wave.open(file_path, "rb")
         ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Anaconda\envs\myenv\Lib\wave.py", line 649, in open
    return Wave_read(f)
           ^^^^^^^^^^^^
  File "D:\Anaconda\envs\myenv\Lib\wave.py", line 282, in __init__
    f = builtins.open(f, 'rb')
        ^^^^^^^^^^^^^^^^^^^^^^
PermissionError: [Errno 13] Permission denied: 'D:\Project\guben\record_save'
```
So the problem is that you are trying to open a folder as if it were a file, which doesn't work. On Windows this produces a PermissionError instead of something more helpful.
Using zipfile, I add files located within another folder, for example: './data/2003-2007/metropolis/Matrix_0_1_0.csv'
My problem is that when I extract the archive, the files end up in ./data/2003-2007/metropolis/Matrix_0_1_0.csv , while I would like them to be extracted in ./
Here is my code:
Here is the print of src and dst:
As shown in: Python: Getting files into an archive without the directory?
The solution is:
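A minimal sketch of that solution, passing arcname so each file is stored without its directory prefix; the archive name archive.zip is a placeholder:

```python
import zipfile

def zip_files(src, dst, arcname=None):
    # src: iterable of file paths; dst: target archive filename;
    # arcname: iterable of names to use inside the archive (must match src).
    with zipfile.ZipFile(dst, 'w', zipfile.ZIP_DEFLATED) as zf:
        for i, path in enumerate(src):
            zf.write(path, arcname[i] if arcname else None)

zip_files(['./data/2003-2007/metropolis/Matrix_0_1_0.csv'],
          'archive.zip',
          ['Matrix_0_1_0.csv'])
```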
Maybe a better solution in this case is to use tarfile :
As written in the documentation, there is a parameter of ZipFile.write called arcname .
So, you can use it to name your file(s) as you want. Note: to make this dynamic, consider importing the pathlib library. In your case:
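A sketch of that idea with pathlib, again treating archive.zip as a placeholder:

```python
from pathlib import Path
import zipfile

src = Path('./data/2003-2007/metropolis/Matrix_0_1_0.csv')
with zipfile.ZipFile('archive.zip', 'w') as zf:
    # Store the file under its bare name instead of its full path.
    zf.write(src, arcname=src.name)
```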
If you want to get every file under a directory and then make a zip from those files, you can do something like this:
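A sketch of one way to do that, storing each file under its path relative to the chosen directory (the directory and archive names are placeholders):

```python
import os
import zipfile

def zip_dir(directory, dst):
    with zipfile.ZipFile(dst, 'w', zipfile.ZIP_DEFLATED) as zf:
        for root, _dirs, files in os.walk(directory):
            for name in files:
                full = os.path.join(root, name)
                # Keep only the path relative to `directory` inside the archive.
                zf.write(full, arcname=os.path.relpath(full, directory))

zip_dir('./data', 'data.zip')
```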
I know it was years ago, but maybe it will be useful for someone.
Here we pass the directory to be zipped to the get_all_file_paths() function and obtain a list containing all file paths. with ZipFile('my_python_files.zip','w') as zip: Here, we create a ZipFile object in WRITE mode this time. for file in file_paths: zip.write(file) Here, we write all the files to the zip file one by one using write method.
ZipFile Objects. class zipfile.ZipFile(file, mode='r', compression=ZIP_STORED, allowZip64=True, compresslevel=None, *, strict_timestamps=True, metadata_encoding=None). Open a ZIP file, where file can be a path to a file (a string), a file-like object or a path-like object. The mode parameter should be 'r' to read an existing file, 'w' to truncate and write a new file, 'a' to ...
Using Python from the shell. You can do this with Python from the shell also using the zipfile module: $ python -m zipfile -c zipname sourcedir. Where zipname is the name of the destination file you want (add .zip if you want it, it won't do it automatically) and sourcedir is the path to the directory.
Knowing how to create, read, write, populate, extract, and list ZIP files using the zipfile module is a useful skill to have as a Python developer or a DevOps engineer. In this tutorial, you'll learn how to: Read, write, and extract files from ZIP files with Python's zipfile; Read metadata about the content of ZIP files using zipfile
myzip.write( 'document.txt' ) In this code, we're creating a new zip archive named compressed_file.zip and just adding document.txt to it. The 'w' parameter means that we're opening the zip file in write mode. Now, if you check your directory, you should see a new zip file named compressed_file.zip.
In Python, the zipfile module allows you to zip and unzip files, i.e., compress files into a ZIP file and extract a ZIP file. zipfile — Work with ZIP archives — Python 3.11.4 documentation. You can also easily zip a directory (folder) and unzip a ZIP file using the make_archive() and unpack_archive() functions from the shutil module.
You can also create a zip file containing multiple files. Here is an example: 'file_to_compressed2.txt', 'file_to_compressed3.txt'. zip_object.write(file_name, compress_type=zipfile.ZIP_DEFLATED) In the above example, we defined the names of multiple source files in a list.
Python Interview Questions on Python Zip Files. Q1. Write a program to print all the contents of the zip file 'EmployeeReport.zip'. Ans 1. Complete code is as follows: from zipfile import ZipFile with ZipFile('EmployeeReport.zip', 'r') as file: file.printdir() Output.
zip .extract( 'file1.txt', '/Users/datagy/') Note that we're opening the zip file using the 'r' method, indicating we want to read the file. We then instruct Python to extract file1.txt and place it into the /Users/datagy/ directory. In many cases, you'll want to extract more than a single file.
zipfile Module. The first thing you need to work with zip files in python is zipfile module. This module provides tools to create, read, write, append, and list a ZIP file. This module does not currently handle multi-disk ZIP files. It can handle ZIP files that use the ZIP64 extensions (ZIP files that are more than 4 GByte in size).
Python's print statement takes a keyword argument called file that decides which stream to write the given message/objects. Its value is a " file-like object." See the definition of print,
Creating uncompressed ZIP file in Python. Uncompressed ZIP files do not reduce the size of the original directory. As no compression is sharing uncompressed ZIP files over a network has no advantage as compared to sharing the original file. Using shutil.make_archive to create Zip file. Python has a standard library shutil which can be used to ...
Files can be compressed without losing any data. Python has built-in support for ZIP files. In this article, we will learn how ZIP files can be read, written, extracted, and listed in Python. List ZIP file contents: The zipfile module in Python, a part of the built-in libraries, can be used to manipulate ZIP files. It is advised to work ...
Create a zip archive from multiple files in Python. Steps are, Create a ZipFile object by passing the new file name and mode as 'w' (write mode). It will create a new zip file and open it within ZipFile object. Call write () function on ZipFile object to add the files in it. call close () on ZipFile object to Close the zip file.
It supports methods for reading data about existing archives as well as modifying the archives by adding additional files. To read the names of the files in an existing archive, use namelist (): import zipfile zf = zipfile.ZipFile('example.zip', 'r') print zf.namelist() The return value is a list of strings with the names of the archive ...
In this article, we will see a Python program that will crack the zip file's password using the brute force method. The ZIP file format is a common archive and compression standard. It is used to compress files. Sometimes, compressed files are confidential and the owner doesn't want to give its access to every individual. Hence, the zip file is pro
10. More Python modules. gzip - compresses/decompresses binary data, reads/writes data into *.gzip file. zipfile - creates, adds, reads, writes to zip archives. getpass - reads password from keyboard. glob - finds files with matched regexp patterns. json - encodes/decodes python object into json, reads/writes json files. argparse - passes parameters to a python script.
Now that the application works, you will write automated tests for it in the next section. Step 2 — Writing your first test. In this section and the following ones, you'll use unittest to write automated tests that ensure the format_file_size() function works correctly. This includes verifying the proper formatting of various file sizes.
How to set up an existing Snowpark project on your local system using a Python IDE. Add a Python UDF to the existing codebase and deploy the function directly in Snowflake. Validate the function deployment locally and test from Snowflake as well. Dive deep into the inner workings of the Snowpark Python UDF deployment.
Python lets you create zip/tar archives quickly. The following command compresses an entire directory: shutil.make_archive(kimeneti_fájlnév, 'zip', könyvtár_neve)
```python
with ZipFile(read_file, 'r') as zipread:
    with ZipFile(file_write_buffer, 'w', ZIP_DEFLATED) as zipwrite:
        for item in zipread.infolist():
            # Copy all ZipInfo attributes for each file since defaults are not preserved
            dest.CRC = item.CRC
            dest.date_time = item.date_time
            dest.create_system = item.create_system
            dest.compress_type = item.compress_type
            dest.external_attr = item.external_attr
            dest ...
```
Summary. This article explains in detail how to use Python to automate the handling of compressed files, including reading, creating, adding to, and extracting ZIP and TAR files. Using Python's built-in zipfile, tarfile, and shutil modules, developers can manage compressed files efficiently and automate file processing. The article provides plenty of example code showing how to use these modules for file backup and extraction in real applications.
Tutorial. In this tutorial, you will create a database of Monty Python movies using basic sqlite3 functionality. It assumes a fundamental understanding of database concepts, including cursors and transactions. First, we need to create a new database and open a database connection to allow sqlite3 to work with it. Call sqlite3.connect() to create a connection to the database tutorial.db in ...
mouse_dpi int: Mouse DPI. mouse_sensitivity float: Aim sensitivity. mouse_fov_width int: The current horizontal value of the viewing angle in the game. mouse_fov_height int: The current vertical value of the viewing angle in the game. mouse_lock_target bool: True: Press once to permanently aim at the target, press again to turn off the aiming. False: Hold down the button to constantly aim ...
Because a private webserver can't access the PyPI repository through the internet, pip will install the dependencies from the .zip file. If you're using a public webserver configuration, you also benefit from a static .zip file, which makes sure the package information remains unchanged until it is explicitly rebuilt.
The path to access volumes is the same whether you use Apache Spark, SQL, Python, or other languages and libraries. This differs from legacy access patterns for files in object storage bound to a Databricks workspace. The path to access files in volumes uses the following format: /
11. ZipFile.write(filename, [arcname[, compress_type]]) takes the name of a local file to be added to the zip file. To write data from a bytearray or bytes object you need to use the ZipFile.writestr(zinfo_or_arcname, bytes[, compress_type]) method instead shown below: zipFile.writestr('name_of_file_in_archive', zipContents) Note: if request ...
In app_config.py, add the lines mentioned below at the beginning of the file: import dotenv; dotenv.load_dotenv(). We are all set; let's run: python app.py. Conclusion: As we conclude this guide on mastering Microsoft Entra, you now possess the knowledge to enhance your application's security and streamline user management.
As shown in: Python: Getting files into an archive without the directory? The solution is:
```python
'''
zip_file:
@src: Iterable object containing one or more element
@dst: filename (path/filename if needed)
@arcname: Iterable object containing the names we want to give to the elements in the archive (has to correspond to src)
'''
def zip_files(src, dst, arcname=None):
    zip_ = zipfile.ZipFile(dst, 'w ...
```