Example of how to list all files in a URL directory and how to download them
Introduction
Example use case: a colleague sends you a URL (for example 'https://**/pub/') pointing to a directory listing of files. The goal here is to list and download all files that end with ".nc".
List all files under the URL directory
The first step is to create a list of all file links. One solution is to use requests:
import requests

url = 'https://******/pub/'
page = requests.get(url).text  # HTML source of the directory listing
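Note that requests.get() does not raise an error on a 404 or 500 response; it simply returns the error page. A minimal sketch of an optional status check (the timeout value is an arbitrary choice and not part of the original example):

import requests

url = 'https://******/pub/'
response = requests.get(url, timeout=30)  # timeout in seconds (arbitrary choice)
response.raise_for_status()               # raise an exception on HTTP errors (404, 500, ...)
page = response.text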
Then, to extract only the links from the page, a solution is to use BeautifulSoup:
from bs4 import BeautifulSoup

soup = BeautifulSoup(page, 'html.parser')
links = [url + node.get('href') for node in soup.find_all('a') if node.get('href').endswith('.nc')]
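Directory listings sometimes contain anchors without an href attribute, or hrefs given as relative paths. A more defensive sketch, assuming you prefer urljoin to build absolute URLs (the None check and urljoin are additions to the original example):

from urllib.parse import urljoin
from bs4 import BeautifulSoup

soup = BeautifulSoup(page, 'html.parser')
links = [urljoin(url, node.get('href'))
         for node in soup.find_all('a')
         if node.get('href') and node.get('href').endswith('.nc')]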
Download all files locally
To download all files, a solution is to use urlretrieve:
import urllib.request

for link in links:
    print(link)
    filename = link.split('/')[-1]  # keep only the file name from the URL
    print(filename)
    urllib.request.urlretrieve(link, filename)  # save the file in the current directory
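For large files, an alternative is to stream the download with requests instead of urlretrieve, so the whole file is never held in memory at once. A minimal sketch (the chunk size of 8192 bytes is an arbitrary choice):

import requests

for link in links:
    filename = link.split('/')[-1]
    with requests.get(link, stream=True) as r:
        r.raise_for_status()
        with open(filename, 'wb') as f:
            for chunk in r.iter_content(chunk_size=8192):  # write the file in chunks
                f.write(chunk)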
See also
| Links | Tags |
|---|---|
| How to download a file (pdf, text, ...) from a url using python ? | Python; urlretrieve |
| How to download a web pdf file from its url in python ? | Python; urlretrieve |