How to list and download all files from a URL directory using Python?

Published: October 26, 2022

Updated: November 21, 2022

Tags: Python; urlretrieve; BeautifulSoup;


Example of how to list all files from a URL directory and how to download them:

Introduction

Example case: a colleague sent you a URL (for example 'https://**/pub/') pointing to a list of files (see image below). The goal here is to list and download all files that end with ".nc":

[Image: listing of the files available under the URL directory]

List all files under the URL directory

The first step is to create a list of all the file links. To do that, a solution is to use requests:

import requests

url = 'https://******/pub/'

page = requests.get(url).text
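
Note that requests.get will not raise an error on its own if the server returns a bad HTTP status. A minimal variation (an addition, not part of the original example) is to check the response before parsing it:

import requests

url = 'https://******/pub/'

response = requests.get(url)
response.raise_for_status()  # raises an exception on 4xx/5xx responses
page = response.text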

Then, to extract only the links from the page, a solution is to use BeautifulSoup:

from bs4 import BeautifulSoup

soup = BeautifulSoup(page, 'html.parser')

links = [url + node.get('href')
         for node in soup.find_all('a')
         if node.get('href') and node.get('href').endswith('.nc')]
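
If the links in the page are relative paths, a more robust way to build the full URLs (a sketch, not part of the original example) is urllib.parse.urljoin, which also avoids duplicated slashes:

from urllib.parse import urljoin

links = [urljoin(url, node.get('href'))
         for node in soup.find_all('a')
         if node.get('href') and node.get('href').endswith('.nc')]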

Download all files locally

To download all files, a solution is to use urlretrieve:

import urllib.request

for link in links:
    print(link)
    filename = link.split('/')[-1]  # use the last part of the URL as the local filename
    print(filename)
    urllib.request.urlretrieve(link, filename)
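
As an alternative to urlretrieve (a sketch, assuming the files can be large), the same downloads can be done with requests in streaming mode, which avoids loading each file entirely into memory:

import requests

for link in links:
    filename = link.split('/')[-1]
    with requests.get(link, stream=True) as r:
        r.raise_for_status()
        with open(filename, 'wb') as f:
            for chunk in r.iter_content(chunk_size=8192):
                f.write(chunk)  # write the file to disk piece by piece
    print(filename, 'downloaded')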

See also
