Fading Coder


Understanding the Simple Repository API for Python Package Indexes


Base HTML API

Requesting the root URL / returns an HTML5 page with one hyperlink per project. The link text is the project name, and the href attribute points to that project's page.

<html>
  <body>
    <a href="/simple/frob/">frob</a>
    <a href="/simple/project-name/">PROJECT_name</a>
  </body>
</html>
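As a rough illustration of how a client might read such a page, the sketch below collects the link text and href of every anchor using only the standard library. The class name ProjectListParser and the SAMPLE_INDEX string are illustrative assumptions, not part of any specification.

```python
from html.parser import HTMLParser


class ProjectListParser(HTMLParser):
    """Collect (link text, href) pairs from the anchors on an index page."""

    def __init__(self):
        super().__init__()
        self.projects = []   # list of (name, href) tuples
        self._href = None    # href of the anchor currently being read
        self._text = []      # text fragments seen inside that anchor

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.projects.append(("".join(self._text).strip(), self._href))
            self._href = None


# Sample page copied from the example above.
SAMPLE_INDEX = """
<html>
  <body>
    <a href="/simple/frob/">frob</a>
    <a href="/simple/project-name/">PROJECT_name</a>
  </body>
</html>
"""

parser = ProjectListParser()
parser.feed(SAMPLE_INDEX)
print(parser.projects)
# [('frob', '/simple/frob/'), ('PROJECT_name', '/simple/project-name/')]
```

Other tags on the page are simply ignored, which matches the rule that pages may contain more than hyperlinks.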

Each project URL returns an HTML5 page with one hyperlink per file. The href attribute is the file's download URL and should carry a hash as a URL fragment of the form #<hashname>=<hashvalue>, such as https://files.pythonhosted.org/packages/.../project-none-any.whl#sha256=d8c8a.... Additional rules:

  • Project URLs must end with /. Repositories should redirect URLs without the trailing slash to the slash-terminated version.
  • URLs may be relative or absolute, as long as they point to the correct location.
  • File locations are not constrained to the repository's relative path.
  • Pages may contain other tags besides hyperlinks.
  • Repositories may redirect non-normalized URLs (e.g., /FooBar/ to /foobar/), but clients must not rely on this and must request normalized URLs.
  • Repositories should use a hash function that is guaranteed to be available in Python's hashlib module (e.g., md5, sha256), with sha256 currently recommended.
  • If a file has a GPG signature, the signature must live alongside the file under the same name with .asc appended (e.g., frob-1.0.tar.gz.asc for frob-1.0.tar.gz).
  • A link may carry a data-gpg-sig attribute with the value true or false to indicate whether a GPG signature exists.
  • A link may carry a data-requires-python attribute stating the Python versions the file supports. In the attribute value, < and > must be HTML-encoded (e.g., &gt;=3).
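The rules above can be sketched as a small parser for a project page. This is a minimal illustration, assuming a hypothetical index at https://index.example and a made-up SAMPLE_PAGE; it resolves relative hrefs, splits the hash out of the URL fragment, and relies on HTMLParser decoding entities in attribute values.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class FileLinkParser(HTMLParser):
    """Collect file links and their attributes from a project page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url  # URL the page was fetched from
        self.files = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return  # pages may contain other tags; skip them
        attrs = dict(attrs)
        href = attrs.get("href")
        if href is None:
            return
        # hrefs may be relative or absolute; the fragment carries the hash.
        url, fragment = urldefrag(urljoin(self.base_url, href))
        hash_name, _, hash_value = fragment.partition("=")
        self.files.append({
            "url": url,
            "hash": (hash_name, hash_value) if hash_value else None,
            # HTMLParser has already decoded &gt; and &lt; here.
            "requires_python": attrs.get("data-requires-python"),
            "gpg_sig": attrs.get("data-gpg-sig"),
        })


# Hypothetical page content, for illustration only.
SAMPLE_PAGE = """
<a href="../../files/frob-1.0.tar.gz#sha256=abc123"
   data-requires-python="&gt;=3">frob-1.0.tar.gz</a>
"""

parser = FileLinkParser("https://index.example/simple/frob/")
parser.feed(SAMPLE_PAGE)
print(parser.files[0]["url"])              # https://index.example/files/frob-1.0.tar.gz
print(parser.files[0]["hash"])             # ('sha256', 'abc123')
print(parser.files[0]["requires_python"])  # >=3
```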

Normalized Names

Project names must contain only ASCII letters, digits, ., -, and _. To normalize a name, lowercase it and replace each run of -, _, and . characters with a single -.

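import re

def normalize_project_name(project_name):
    """Return the normalized form of a project name."""
    # Collapse each run of -, _ and . into one hyphen, then lowercase.
    return re.sub(r"[-_.]+", "-", project_name).lower()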
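Since clients must request normalized URLs, one possible use of normalization is building the project URL. The sketch below is an assumption-laden example: project_url, index_root, and the index.example host are hypothetical, and the validity check covers only the character set described above.

```python
import re

# Characters allowed in a project name, per the rule stated above.
_VALID_NAME = re.compile(r"[A-Za-z0-9._-]+")


def project_url(index_root, project_name):
    """Build the normalized, slash-terminated URL for a project."""
    if not _VALID_NAME.fullmatch(project_name):
        raise ValueError(f"invalid project name: {project_name!r}")
    normalized = re.sub(r"[-_.]+", "-", project_name).lower()
    return f"{index_root.rstrip('/')}/{normalized}/"


print(project_url("https://index.example/simple", "PROJECT_name"))
# https://index.example/simple/project-name/
print(project_url("https://index.example/simple/", "Foo.-_Bar"))
# https://index.example/simple/foo-bar/
```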

Yank Support

A link may carry a data-yanked attribute, which marks the file as yanked. The attribute value, if any, is a free-form string explaining why. Installers should ignore yanked files unless they are the only way to satisfy a requirement, for example when a version is pinned exactly with == or ===, or when installing from a lock file.

Mirrors may either omit yanked files entirely or include them together with the data-yanked attribute; they must not mirror a yanked file without that attribute.
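One way an installer could apply this policy is sketched below. It assumes each candidate is a dict whose "yanked" entry is None when the file is not yanked and the reason string (possibly empty) otherwise, and that the list passed in already matches the requirement. The function names are illustrative, and real installers may apply finer rules.

```python
def is_exact_pin(specifier):
    """True for == or === pins that are not wildcard ranges such as ==1.*."""
    spec = specifier.strip()
    if spec.startswith("==="):
        return True
    return spec.startswith("==") and not spec.endswith(".*")


def usable_files(matching_files, specifier=""):
    """Drop yanked files unless they are the only match for an exact pin."""
    non_yanked = [f for f in matching_files if f.get("yanked") is None]
    if non_yanked or not is_exact_pin(specifier):
        return non_yanked
    # Only yanked files match, and the user pinned this exact version.
    return list(matching_files)


yanked = {"filename": "frob-1.0.tar.gz", "yanked": "broken build"}
current = {"filename": "frob-1.1.tar.gz", "yanked": None}

print(usable_files([yanked, current], ">=1.0"))  # only frob-1.1.tar.gz
print(usable_files([yanked], "==1.0"))           # the yanked file is allowed
print(usable_files([yanked], ">=1.0"))           # []
```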

API Versioning

Pages may include a meta tag with name="pypi:repository-version" whose content gives the API version in Major.Minor format (e.g., 1.0). A major version change indicates a backward-incompatible update, while a minor version change is backward-compatible.

Clients should check the repository version in each response and assume 1.0 if the tag is absent. If the major version is greater than the client expects, the client must fail with an error. If only the minor version is greater, the client should emit a warning.
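The check described above might look like the following sketch. SUPPORTED_MAJOR, SUPPORTED_MINOR, and check_repository_version are hypothetical names; the function takes the content of the meta tag, or None when the tag is missing.

```python
import warnings

# Version this hypothetical client understands.
SUPPORTED_MAJOR, SUPPORTED_MINOR = 1, 0


def check_repository_version(content=None):
    """Validate a pypi:repository-version value; None means the tag was absent."""
    major, minor = (int(part) for part in (content or "1.0").split("."))
    if major > SUPPORTED_MAJOR:
        # Backward-incompatible: refuse to continue.
        raise RuntimeError(f"unsupported repository version {major}.{minor}")
    if major == SUPPORTED_MAJOR and minor > SUPPORTED_MINOR:
        # Backward-compatible but newer: continue with a warning.
        warnings.warn(f"repository version {major}.{minor} is newer than supported")
    return major, minor


print(check_repository_version(None))   # (1, 0)
print(check_repository_version("1.0"))  # (1, 0)
```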

Distribution Metadata

A link may include a data-dist-info-metadata attribute, indicating that the distribution's core metadata file is available at {file_url}.metadata. The attribute value is either <hashname>=<hashvalue>, giving a hash of the metadata file, or true when no hash is available. If the attribute is absent, tools should fall back to downloading the distribution and reading its metadata.
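A small sketch of how a client could interpret the attribute follows. It assumes file_url has already had its hash fragment removed and that attr_value is None when the attribute is absent; metadata_location is a hypothetical helper name.

```python
def metadata_location(file_url, attr_value):
    """Return (metadata_url, expected_hash), or None if metadata is not advertised.

    expected_hash is a (hash_name, hash_value) tuple, or None when the
    attribute value is "true" and no hash was provided.
    """
    if attr_value is None:
        return None  # caller should download the distribution instead
    url = file_url + ".metadata"
    if attr_value == "true" or "=" not in attr_value:
        return url, None
    hash_name, _, hash_value = attr_value.partition("=")
    return url, (hash_name, hash_value)


print(metadata_location("https://index.example/files/frob-1.0-py3-none-any.whl", "sha256=abc123"))
# ('https://index.example/files/frob-1.0-py3-none-any.whl.metadata', ('sha256', 'abc123'))
```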

JSON-based Simple API

This API serializes the same information as the HTML API in JSON. File downloads are unchanged, and HTML responses remain available. It uses the same versioning scheme as the HTML API.

JSON Serialization Rules

  • All JSON responses are objects, not arrays or other types.
  • URL strings may be absolute or relative.
  • Clients should ignore unrecognized keys.
  • All responses include a meta key with metadata, including api-version.
  • Requirements from the Base HTML API apply, excluding HTML-specific aspects.

Project List

The root URL / returns a JSON object with:

  • projects: A list of objects, each with a name key for the project name.
  • meta: Response metadata.

Example response:

{
  "meta": {
    "api-version": "1.0"
  },
  "projects": [
    {"name": "Frob"},
    {"name": "spamspamspam"}
  ]
}
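A client could read this response roughly as follows. The sketch assumes the payload has already been fetched as text; project_names is an illustrative helper that checks the response is an object, checks the major version, and ignores keys it does not recognize.

```python
import json

# Sample payload copied from the example above.
SAMPLE = """
{
  "meta": {"api-version": "1.0"},
  "projects": [{"name": "Frob"}, {"name": "spamspamspam"}]
}
"""


def project_names(payload):
    """Return the project names from a JSON project list response."""
    data = json.loads(payload)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    major = int(data["meta"]["api-version"].split(".")[0])
    if major != 1:
        raise ValueError(f"unsupported api-version: {data['meta']['api-version']}")
    # Unrecognized keys are simply never read.
    return [project["name"] for project in data["projects"]]


print(project_names(SAMPLE))  # ['Frob', 'spamspamspam']
```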

Project Details

Project URLs /<project>/ return a JSON object with:

  • name: Normalized project name.
  • files: A list of objects representing files.
  • meta: Metadata.

Each file object includes:

  • filename: File name.
  • url: Download URL.
  • hashes: Dictionary mapping hash names to values (e.g., {"sha256": "..."}).
  • requires-python: Optional Python version requirement.
  • dist-info-metadata: Optional; boolean or dictionary with metadata file hashes.
  • gpg-sig: Optional boolean for GPG signature.
  • yanked: Optional; either a boolean or a string giving the reason the file was yanked.

Example response:

{
  "meta": {
    "api-version": "1.0"
  },
  "name": "holygrail",
  "files": [
    {
      "filename": "holygrail-1.0.tar.gz",
      "url": "https://example.com/files/holygrail-1.0.tar.gz",
      "hashes": {"sha256": "...", "blake2b": "..."},
      "requires-python": ">=3.7",
      "yanked": "Had a vulnerability"
    }
  ]
}
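The hashes dictionary lets a client verify a download. The sketch below is one possible approach, not a prescribed one: verify_download and the PREFERRED ordering are assumptions, and only algorithms with a fixed-length digest in hashlib are considered.

```python
import hashlib

# Assumed preference order; sha256 first, as currently recommended.
PREFERRED = ("sha256", "sha512", "sha384", "blake2b")


def verify_download(file_entry, content):
    """Check downloaded bytes against the hashes listed for a file object."""
    hashes = file_entry.get("hashes", {})
    for name in PREFERRED:
        if name in hashes:
            return hashlib.new(name, content).hexdigest() == hashes[name]
    raise ValueError("no supported hash available for this file")


content = b"example archive bytes"
entry = {
    "filename": "holygrail-1.0.tar.gz",
    "hashes": {"sha256": hashlib.sha256(content).hexdigest()},
}

print(verify_download(entry, content))      # True
print(verify_download(entry, b"tampered"))  # False
```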

Content Types

Responses use Content-Type headers of the form application/vnd.pypi.simple.$version+$format. For version 1.0:

  • JSON: application/vnd.pypi.simple.v1+json
  • HTML: application/vnd.pypi.simple.v1+html

For compatibility with legacy clients, text/html is treated as an alias for application/vnd.pypi.simple.v1+html.
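To make the naming scheme concrete, the following sketch splits a Content-Type value into its version and format parts. The function name parse_simple_content_type is hypothetical, and the pattern only covers the vN versions and the two formats listed above.

```python
import re

_SIMPLE_TYPE = re.compile(r"application/vnd\.pypi\.simple\.(v\d+)\+(json|html)")


def parse_simple_content_type(header_value):
    """Return (version, format) for a Simple API Content-Type header."""
    # Drop parameters such as "; charset=utf-8" before matching.
    media_type = header_value.split(";", 1)[0].strip().lower()
    if media_type == "text/html":
        return "v1", "html"  # legacy alias for the v1 HTML type
    match = _SIMPLE_TYPE.fullmatch(media_type)
    if match is None:
        raise ValueError(f"not a Simple API content type: {header_value!r}")
    return match.group(1), match.group(2)


print(parse_simple_content_type("application/vnd.pypi.simple.v1+json"))  # ('v1', 'json')
print(parse_simple_content_type("text/html; charset=utf-8"))             # ('v1', 'html')
```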

Version and Format Selection

Clients use HTTP content negotiation: the Accept header lists the versions and formats they can handle, optionally weighted with quality values (q=) to express preference. The server picks a type it supports, or can respond with 406 Not Acceptable if it supports none of them.

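import requests

# Listed in order of preference; q values weight the fallbacks.
accept_types = [
    "application/vnd.pypi.simple.v1+json",
    "application/vnd.pypi.simple.v1+html;q=0.2",
    "text/html;q=0.01",  # legacy alias, lowest preference
]
headers = {"Accept": ", ".join(accept_types)}
response = requests.get("https://pypi.org/simple/", headers=headers, timeout=30)
response.raise_for_status()

# Compare the media type only, ignoring parameters such as charset.
content_type = response.headers.get("Content-Type", "").split(";")[0].strip()
if content_type == "application/vnd.pypi.simple.v1+json":
    process_json(response.json())  # process_json: handler supplied by the caller
elif content_type in ("application/vnd.pypi.simple.v1+html", "text/html"):
    process_html(response.text)  # process_html: handler supplied by the caller
else:
    raise ValueError(f"Unsupported content type: {content_type!r}")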

For clients that cannot use content negotiation, a server may offer alternatives such as a URL parameter (e.g., ?format=application/vnd.pypi.simple.v1+json) or separately configured endpoints (e.g., /simple/v1+json/).
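The server side of this selection could be sketched as below. This is a simplified illustration, not how any particular index implements it: SUPPORTED, parse_accept, and choose_content_type are hypothetical names, the Accept parsing handles only q parameters, and serving text/html when no Accept header is sent is an assumption.

```python
SUPPORTED = [
    "application/vnd.pypi.simple.v1+json",
    "application/vnd.pypi.simple.v1+html",
    "text/html",
]


def parse_accept(header):
    """Return (media_type, q) pairs sorted by descending q, header order on ties."""
    entries = []
    for item in header.split(","):
        parts = [part.strip() for part in item.split(";")]
        media_type = parts[0].lower()
        if not media_type:
            continue
        q = 1.0
        for param in parts[1:]:
            key, _, value = param.partition("=")
            if key.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q as "not acceptable"
        entries.append((media_type, q))
    # sorted() is stable, so equal q values keep their header order.
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


def choose_content_type(accept_header=None, format_param=None):
    """Return the content type to serve, or None to signal 406 Not Acceptable."""
    if format_param is not None:
        # The ?format= URL parameter takes the place of negotiation.
        return format_param if format_param in SUPPORTED else None
    if not accept_header:
        return "text/html"  # assumption: default for clients sending no Accept
    for media_type, q in parse_accept(accept_header):
        if q <= 0:
            continue
        if media_type == "*/*":
            return SUPPORTED[0]
        if media_type in SUPPORTED:
            return media_type
    return None


print(choose_content_type("application/vnd.pypi.simple.v1+json, text/html;q=0.01"))
# application/vnd.pypi.simple.v1+json
```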

Additional Fields in API Version 1.1

For JSON responses in version 1.1 or higher:

  • api-version must be 1.1 or greater.
  • The project detail response has a top-level versions key listing all of the project's versions as strings.
  • Each file object includes a mandatory size (in bytes) and an optional upload-time (an ISO 8601 timestamp).
  • Keys starting with underscores are reserved for server use.
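A client might consume the 1.1 fields along these lines. The SAMPLE object, the summarize_files helper, and the _cache_hint key are invented for illustration; the timestamp handling assumes a trailing Z for UTC.

```python
from datetime import datetime


def summarize_files(detail):
    """Return (filename, size, upload_time) tuples from a 1.1+ project detail object."""
    major, minor = (int(part) for part in detail["meta"]["api-version"].split("."))
    if (major, minor) < (1, 1):
        raise ValueError("size and upload-time require api-version 1.1 or later")
    rows = []
    for entry in detail["files"]:
        raw = entry.get("upload-time")  # optional field
        # Rewrite the trailing Z so older fromisoformat versions accept it.
        uploaded = datetime.fromisoformat(raw.replace("Z", "+00:00")) if raw else None
        rows.append((entry["filename"], entry["size"], uploaded))
    return rows


SAMPLE = {
    "meta": {"api-version": "1.1"},
    "name": "holygrail",
    "versions": ["1.0"],
    "files": [{
        "filename": "holygrail-1.0.tar.gz",
        "url": "https://example.com/files/holygrail-1.0.tar.gz",
        "hashes": {},
        "size": 123456,
        "upload-time": "2023-01-02T03:04:05.000000Z",
        "_cache_hint": "ignored",  # hypothetical private key; clients skip it
    }],
}

rows = summarize_files(SAMPLE)
print(rows[0][0], rows[0][1])  # holygrail-1.0.tar.gz 123456
```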

Metadata Attribute Renaming

The metadata attribute has been renamed. In HTML, clients should read data-core-metadata and fall back to data-dist-info-metadata only when it is absent. In JSON, they should read core-metadata and fall back to dist-info-metadata.
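For the JSON case, the fallback order can be expressed in a few lines. The helper name core_metadata_value is an assumption; it returns whatever value the server supplied (a boolean or a dictionary of hashes), or None when neither key is present.

```python
def core_metadata_value(file_entry):
    """Return the metadata marker for a JSON file object, preferring the new key."""
    for key in ("core-metadata", "dist-info-metadata"):
        if key in file_entry:
            return file_entry[key]
    return None


print(core_metadata_value({"core-metadata": {"sha256": "abc"}, "dist-info-metadata": True}))
# {'sha256': 'abc'}
print(core_metadata_value({"dist-info-metadata": True}))  # True
print(core_metadata_value({"filename": "frob-1.0.tar.gz"}))  # None
```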

Tags: Python, PyPI
