Introduction
This tutorial shows how to list and download all files from a public URL directory — for example, when someone shares a web directory like https://******/pub/ containing multiple files.
We’ll focus on downloading all files that end with .nc (NetCDF files), but you can adapt the method for any file type.
Downloading Files from a URL Directory
Imagine a colleague sends you a link to a web directory — for example:
```
https://******/pub/
```
When you open it in your browser, you see a plain directory listing of files.
Your goal is to automatically list and download all the files ending with .nc using Python.

[Figure: .nc files from a URL directory using Python]
Step 1 — List All Files Under the URL Directory
The first step is to retrieve the HTML content of the directory page. You can do this easily with the requests library:
```python
import requests

url = 'https://******/pub/'
page = requests.get(url).text
```
Next, you need to extract all file links from the page.
A simple and efficient way to do that is by using BeautifulSoup:
```python
from bs4 import BeautifulSoup

soup = BeautifulSoup(page, 'html.parser')

# Keep only links to files ending with ".nc".
# The extra "node.get('href')" check skips anchors without an href
# attribute, which would otherwise raise an AttributeError.
links = [url + node.get('href')
         for node in soup.find_all('a')
         if node.get('href') and node.get('href').endswith('.nc')]

print("Found files:")
for link in links:
    print(link)
```
This will give you a list of complete URLs for all .nc files in the directory.
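Note that this works because the hrefs in the listing are simple relative filenames that can be appended to the base URL. If the listing instead contains absolute paths or full URLs, plain string concatenation produces broken links. A more robust sketch uses the standard library's `urllib.parse.urljoin`, which handles all three cases the same way (the `example.com` URLs below are placeholders, not from the original post):

```python
from urllib.parse import urljoin

# Placeholder base URL and a mix of href styles you might encounter
base = 'https://example.com/pub/'
hrefs = ['data1.nc', '/pub/data2.nc', 'https://example.com/pub/data3.nc']

# urljoin resolves relative names, absolute paths, and full URLs alike
links = [urljoin(base, h) for h in hrefs if h.endswith('.nc')]
print(links)
```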
Step 2 — Download All Files Locally
Once you have the list of links, you can download each file using urllib.request.urlretrieve:
```python
import urllib.request

for link in links:
    filename = link.split('/')[-1]
    print(f"Downloading: {filename}")
    urllib.request.urlretrieve(link, filename)

print("✅ Download completed!")
```
This script will:
- Extract the filename from each URL.
- Download each file and save it locally in the current directory.
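Re-running the script downloads everything again, since `urlretrieve` silently overwrites existing files. One simple way to make the loop resumable is to skip files that already exist locally; a minimal sketch (the empty `links` list here is a placeholder for the list built in Step 1):

```python
import os
import urllib.request

links = []  # placeholder: the list of .nc URLs built in Step 1

for link in links:
    filename = link.split('/')[-1]
    if os.path.exists(filename):  # already downloaded on a previous run
        print(f"Skipping (already exists): {filename}")
        continue
    urllib.request.urlretrieve(link, filename)
```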
Optional Improvements
Add a progress bar with tqdm:
```python
from tqdm import tqdm

for link in tqdm(links, desc="Downloading files"):
    filename = link.split('/')[-1]
    urllib.request.urlretrieve(link, filename)
```
Handle errors gracefully:
```python
try:
    urllib.request.urlretrieve(link, filename)
except Exception as e:
    print(f"❌ Failed to download {link}: {e}")
```
Change output directory:
```python
import os

output_dir = "downloads"
os.makedirs(output_dir, exist_ok=True)

for link in links:
    filename = os.path.join(output_dir, link.split('/')[-1])
    urllib.request.urlretrieve(link, filename)
```
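Download large files in chunks: NetCDF files can be large, and `requests` with `stream=True` lets you write the response to disk chunk by chunk, set a timeout, and surface HTTP errors via `raise_for_status`. A sketch under those assumptions (the function name `download_stream` and its parameters are my own, not from the original post):

```python
import requests

def download_stream(url, dest, chunk_size=1024 * 1024):
    """Stream a (potentially large) file to disk in 1 MB chunks."""
    with requests.get(url, stream=True, timeout=30) as r:
        r.raise_for_status()  # fail loudly on 404, 403, etc.
        with open(dest, 'wb') as f:
            for chunk in r.iter_content(chunk_size=chunk_size):
                f.write(chunk)
```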
See Also
| Links | Tags |
|---|---|
| How to download a file (PDF, text, …) from a URL using Python? | Python; urlretrieve |
| How to download a web PDF file from its URL in Python? | Python; urlretrieve |
References
| Links | Site |
|---|---|
| Requests — Official Documentation | requests.readthedocs.io |
| BeautifulSoup — Official Documentation | crummy.com |
| urllib.request — Python Standard Library | docs.python.org |
| tqdm — Progress Bar for Python | tqdm.github.io |
