tutorials|February 04, 2021|2 min read

Azure Storage Blob - How to List and Download Blob from Azure Storage container in Python (No Azure library)

TL;DR

Use Python's requests library to call Azure Storage REST APIs directly to list and download blobs without any Azure SDK dependency.

Introduction

In this tutorial, we will see how to list and download the blobs in an Azure Storage container without using the Azure Python libraries.

Note: No Azure library is used, only REST API calls.

Pre-requisite

This tutorial is based on Python 3.7.

PyPI Dependency

The only third-party package required is requests.

Complete Code

import requests
import re
import os

def _get_file_list_helper(container, next_marker=None):
  """
  Get the files list by using next_marker
  """
  account_name = container['account_name']
  container_name = container['container_name']
  curl_url = f'https://{account_name}.blob.core.windows.net/{container_name}?restype=container&comp=list&'
  if next_marker:
    curl_url += f'marker={next_marker}&'
  curl_url += container['sas_token']

  print('Executing rest call to azure')
  r = requests.get(curl_url)
  # fail early on an HTTP error (e.g. an expired SAS token)
  r.raise_for_status()
  text = r.text

  # a non-empty NextMarker indicates there are more files
  next_marker = re.findall(r'<NextMarker>([^<]*)</NextMarker>', text)
  file_names = re.findall(r'<Name>([^<]*)</Name>', text)

  return {'files': file_names, 'next_marker': next_marker}

def get_file_list(container):
  """
  Get the files list
  """
  files = []
  next_marker = None
  while True:
    files_data = _get_file_list_helper(container, next_marker)
    files.extend(files_data['files'])
    if not files_data['next_marker']:
      break
    next_marker = files_data['next_marker'][0]
  return files

def download_files(container, local_dest_path):
  """
  Download every blob in the container under local_dest_path
  """
  files = get_file_list(container)

  account_name = container['account_name']
  container_name = container['container_name']
  url_path = f'https://{account_name}.blob.core.windows.net/{container_name}/'
  url_end_path = '?' + container['sas_token']

  for file_name in files:
    print(f'Downloading: {file_name}')
    url = f'{url_path}{file_name}{url_end_path}'
    path = f'{local_dest_path}/{file_name}'
    # create the local directory (blob names may contain '/')
    os.makedirs(os.path.dirname(path), exist_ok=True)

    # make the request
    r = requests.get(url)
    r.raise_for_status()

    # write the file
    with open(path, "wb") as download_file:
      download_file.write(r.content)

## main starts here
local_dest_path = './container_blob'

container = {
    'account_name': 'account_name',
    'container_name': 'container_name',
    'sas_token': 'xxxxxxxxxx'
}
download_files(container, local_dest_path)

Explanation

The code calls two Azure Storage REST API operations: List Blobs to enumerate the container, and Get Blob to download each file. Authentication is handled by the SAS token appended to each URL.
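As a rough illustration of the two URL shapes involved, the sketch below builds a List Blobs URL and a blob download URL, percent-encoding the marker and the blob name along the way. The helper names build_list_url and build_blob_url, the account and container names, and the SAS token value are all hypothetical and not part of the original code.

```python
from urllib.parse import quote


def build_list_url(account_name, container_name, sas_token, marker=None):
    """Build a List Blobs URL, optionally continuing from a marker (hypothetical helper)."""
    base = f"https://{account_name}.blob.core.windows.net/{container_name}"
    query = "restype=container&comp=list"
    if marker:
        # Treat the marker as opaque and percent-encode it fully.
        query += f"&marker={quote(marker, safe='')}"
    # The SAS token is itself a query string; drop a leading '?' if present.
    return f"{base}?{query}&{sas_token.lstrip('?')}"


def build_blob_url(account_name, container_name, blob_name, sas_token):
    """Build a download URL for one blob (hypothetical helper)."""
    base = f"https://{account_name}.blob.core.windows.net/{container_name}"
    # Keep '/' unescaped so virtual directories in the blob name survive.
    return f"{base}/{quote(blob_name, safe='/')}?{sas_token.lstrip('?')}"


# Placeholder values, for illustration only.
print(build_list_url("myaccount", "mycontainer", "sig=placeholder"))
print(build_blob_url("myaccount", "mycontainer", "abc/test log.txt", "sig=placeholder"))
```

Encoding the blob name matters mainly for names that contain spaces or characters such as # and ?, which would otherwise change how the URL is interpreted.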

Understanding next_marker

When a container holds many blobs, a single response does not list all of them. The service returns one page of results (up to 5,000 by default) along with a NextMarker value, which indicates that more blobs remain. Send this value as the marker query parameter in the next request, and repeat until the response comes back without a NextMarker value.
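The marker loop can be sketched independently of any HTTP call. In the minimal example below, iter_all_blobs, fake_fetch, and the PAGES data are hypothetical names used only for illustration; fetch_page stands in for a function that performs one List Blobs request and returns the names plus the next marker.

```python
def iter_all_blobs(fetch_page):
    """Yield blob names across all pages.

    fetch_page(marker) is assumed to return (names, next_marker), where
    next_marker is empty or None on the last page.
    """
    marker = None
    while True:
        names, marker = fetch_page(marker)
        yield from names
        if not marker:
            break


# A fake two-page listing, standing in for real REST responses.
PAGES = {
    None: (["a.log", "b.log"], "page-2"),
    "page-2": (["c.log"], ""),
}


def fake_fetch(marker):
    return PAGES[marker]


print(list(iter_all_blobs(fake_fetch)))  # ['a.log', 'b.log', 'c.log']
```

Writing the loop as a generator means a caller can start downloading blobs from the first page before later pages have been requested.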

Usage with Azure Official Python Libraries

For usage with Azure official Python libraries, see: List and Download Azure blobs by Azure Python Libraries

Sample Response from the List Blobs REST API

<?xml version="1.0" encoding="utf-8"?><EnumerationResults ServiceEndpoint="https://hubbledmeprodlocb.blob.core.windows.net/" ContainerName="container_name">
  <Blobs>
    <Blob>
      <Name>abc/test.log</Name>
      <Properties>
        <Last-Modified>Mon, 02 Dec 2019 09:42:50 GMT</Last-Modified>
        <Etag>0x8D7770BFF1CC8A1</Etag>
        <Content-Length>423</Content-Length>
        <Content-Type>application/octet-stream</Content-Type>
        <Content-Encoding /><Content-Language />
        <Content-MD5>3ycLC3CutKkybJtlgvEdsQ==</Content-MD5>
        <Cache-Control />
        <Content-Disposition />
        <BlobType>BlockBlob</BlobType>
        <LeaseStatus>unlocked</LeaseStatus>
        <LeaseState>available</LeaseState>
        <ServerEncrypted>true</ServerEncrypted>
      </Properties>
    </Blob>
  ...
  </Blobs>
  <NextMarker>marker_id</NextMarker>

</EnumerationResults>
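A response of this shape can also be parsed with the standard library's xml.etree.ElementTree instead of regular expressions. The sketch below is one possible approach, not the method used in the code above; parse_listing and the trimmed SAMPLE document are hypothetical. One practical difference is that the XML parser decodes entities such as &amp;amp; in blob names, which a plain regex leaves as-is.

```python
import xml.etree.ElementTree as ET

# A trimmed, made-up response in the same shape as the sample above.
SAMPLE = """<?xml version="1.0" encoding="utf-8"?>
<EnumerationResults ServiceEndpoint="https://account_name.blob.core.windows.net/" ContainerName="container_name">
  <Blobs>
    <Blob><Name>abc/test.log</Name></Blob>
    <Blob><Name>abc/a&amp;b.log</Name></Blob>
  </Blobs>
  <NextMarker>marker_id</NextMarker>
</EnumerationResults>"""


def parse_listing(xml_bytes):
    """Return (blob_names, next_marker) from a List Blobs response body."""
    root = ET.fromstring(xml_bytes)
    names = [el.text for el in root.iterfind("./Blobs/Blob/Name")]
    # A missing or empty NextMarker both mean this is the last page.
    next_marker = root.findtext("NextMarker") or None
    return names, next_marker


# Bytes are passed in, as one would pass r.content from requests.
names, next_marker = parse_listing(SAMPLE.encode("utf-8"))
print(names)        # ['abc/test.log', 'abc/a&b.log']
print(next_marker)  # marker_id
```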

Hope it helps.
