Automate URL Submission to Google Search Index API: A Step-by-Step Guide in Python

Updated May 2026. Refreshed Python script with proper exponential backoff, 429 rate-limit handling, and a section on what the Indexing API actually does in 2026 versus what people assume it does.

The Google Indexing API is one of those tools that looks more powerful than it actually is. Submitting URLs through it does not guarantee indexing. It puts them in a priority crawl queue. For time-sensitive content (job postings, event pages, breaking news) that is genuinely useful. For evergreen content it does very little that a clean sitemap would not. We are Osher Digital, a Brisbane-based AI and automation consultancy, and this is the working script and operational notes we use on our own site and on client SEO automation work.

This guide covers what the Google Indexing API actually does in 2026, when it is the right tool, the Python code to use it cleanly (auth, rate limits, retry logic), and the alternatives for cases where the API is not the right answer. The code below works against the current v3 endpoint as of May 2026, which is the same HTTP endpoint you would hit from any language.

If you only need the script, jump to the section labelled “The Working Python Script” further down. If you want to understand why the script is structured the way it is, read straight through.

What the Google Indexing API Actually Does

Three sentences. The Google Indexing API takes a URL plus a notification type (URL_UPDATED or URL_DELETED) and tells Google that the URL has changed. Google then queues that URL for crawling at higher priority than the default sitemap-driven schedule. The API does not guarantee that the URL will be indexed, only that it will be queued for crawl.
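
As a rough illustration, the request body is nothing more than those two fields. The helper name build_notification is ours and hypothetical; the field names and the two notification types match the script later in this guide.

```python
import json

# The two notification types described above.
VALID_TYPES = {"URL_UPDATED", "URL_DELETED"}


def build_notification(url, notification_type="URL_UPDATED"):
    """Return the JSON-serialisable body for one publish call."""
    if notification_type not in VALID_TYPES:
        raise ValueError(f"unknown notification type: {notification_type}")
    if not url.startswith(("http://", "https://")):
        raise ValueError(f"not an absolute URL: {url}")
    return {"url": url, "type": notification_type}


if __name__ == "__main__":
    # Print the body for a page that was just published or changed.
    print(json.dumps(build_notification("https://example.com/jobs/123")))
```

Validating the type locally is a small convenience: a typo fails before it costs a request against the daily quota.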

What it explicitly does not do:

Index the URL directly. The crawler still has to visit and decide.
Override Google’s quality signals. A thin or duplicate page will not get indexed faster.
Work reliably for all content types. Officially it is for JobPosting and BroadcastEvent structured data. In practice it works for any URL but Google reserves the right to change that.
Replace your sitemap. The sitemap is still the primary path for the bulk of your URLs.
Help with re-ranking. The API affects crawl, not ranking.

The most common misconception we see is that submitting a URL through the API guarantees indexing. It does not. We have seen plenty of URLs submitted via the API that never got indexed because the underlying page quality was thin. The API speeds up Google noticing the page. It cannot force Google to want it.

When the Google Indexing API Is the Right Tool

Three real use cases where the Google Indexing API earns its place.

Time-sensitive content with structured data. Job boards posting new roles, ticketing sites listing new events, news sites publishing breaking stories. These are the documented use cases and they work well. The API gets the page into the crawl queue within hours instead of days.

Large sitemaps where new URLs get buried. If your sitemap has 50,000 URLs and Google crawls 200 per day, new content can wait weeks to be picked up. The API jumps the queue. We have used this on a client e-commerce site with 80,000 product URLs where new products were waiting two weeks for first crawl. The API brought that to under six hours.

URL removals you need acknowledged quickly. URL_DELETED notifications are the cleanest way to tell Google a URL is gone. Useful for taking down expired job postings, removed products, or content with legal removal requests.

Outside those three cases, the marginal benefit is small. For most sites, a clean sitemap and a healthy internal linking structure do the job the API would do. The case for adding the API to your stack should be specific.

Setting Up Google Cloud Authentication

The Google Indexing API uses service account authentication. The setup has six steps and one of them (Search Console ownership) is the one that catches everyone.

1. Create a Google Cloud project at console.cloud.google.com. Note the project ID.
2. In the project, go to APIs and Services, Library. Search for Indexing API and enable it.
3. Go to IAM and Admin, Service Accounts. Create a new service account. Give it a clear name like “indexing-api-prod”. The role is irrelevant for the API itself, so set it to none or Viewer.
4. On the service account detail page, go to Keys. Add Key, Create new key, JSON. Download the JSON file. This is the file your Python script will use.
5. Go to Search Console. Open the property you want the API to submit URLs for.
6. Settings, Users and Permissions, Add User. Paste the service account email (it looks like indexing-api-prod@your-project-id.iam.gserviceaccount.com). Set the permission to Owner. Save.

Step 6 is the one nobody does correctly the first time. If you skip it, every API call returns 403 Forbidden. The service account needs to be an Owner of the Search Console property, not a Full or Restricted user. Propagation usually takes a few minutes but can take longer, so give it at least 30 minutes before assuming the permission did not stick.
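
A small preflight check can save a round of 403 debugging. The sketch below is an assumption on our part rather than anything the API requires: it reads the downloaded key file, confirms it is a service account key, and returns the email address to paste into Search Console. The function name is hypothetical.

```python
import json

# Fields present in a downloaded service account key file that this check relies on.
REQUIRED_FIELDS = ("type", "client_email", "private_key")


def service_account_email(key_path):
    """Read a service account key file and return its client_email."""
    with open(key_path, encoding="utf-8") as f:
        info = json.load(f)
    missing = [k for k in REQUIRED_FIELDS if k not in info]
    if missing:
        raise ValueError(f"key file is missing fields: {missing}")
    if info["type"] != "service_account":
        raise ValueError(f"expected a service_account key, got {info['type']!r}")
    return info["client_email"]
```

Calling service_account_email("./service-account-key.json") and comparing the result with the user listed in Search Console rules out the most common mismatch: a key from one service account and an Owner entry for another.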

The Working Python Script

Three modules. Authentication, single-URL submission with proper error handling, and a sitemap-driven main loop with rate-limit handling. The script is structured this way because the failure modes are different at each layer and you want clean separation.

Dependencies

Install via pip:

pip install requests google-auth google-auth-httplib2 lxml

Authentication module

Loads the JSON key file and returns an AuthorizedSession that signs every request.

from google.oauth2 import service_account
from google.auth.transport.requests import AuthorizedSession

SCOPES = ["https://www.googleapis.com/auth/indexing"]
SERVICE_ACCOUNT_FILE = "./service-account-key.json"

def get_authed_session():
    credentials = service_account.Credentials.from_service_account_file(
        SERVICE_ACCOUNT_FILE, scopes=SCOPES
    )
    return AuthorizedSession(credentials)

URL submission with retries

Submits a single URL. Distinguishes 403 (auth problem, give up immediately), 429 (rate limit, stop the whole run), 5xx (transient, retry with exponential backoff), and other errors (give up on this URL).

import time

API_URL = "https://indexing.googleapis.com/v3/urlNotifications:publish"

def submit_url(session, url, notification_type="URL_UPDATED"):
    payload = {"url": url, "type": notification_type}
    response = session.post(API_URL, json=payload)

    if response.status_code == 200:
        return "ok"
    if response.status_code == 403:
        raise PermissionError("Service account is not a Search Console owner")
    if response.status_code == 429:
        return "rate_limited"
    if response.status_code in (500, 502, 503, 504):
        return "retry"
    return f"error_{response.status_code}"

def submit_with_backoff(session, url, max_retries=5):
    for attempt in range(max_retries):
        result = submit_url(session, url)
        if result == "ok":
            return True
        if result == "rate_limited":
            return False  # stop the run, retry tomorrow
        if result == "retry":
            time.sleep(2 ** attempt)
            continue
        return False
    return False

The exponential backoff (2 ** attempt) gives us 1, 2, 4, 8, 16 second delays before giving up. That is enough to ride out a brief Google service hiccup without burning the quota.
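
To make the schedule explicit, here is a minimal sketch of the same doubling delays as a standalone function. The cap and the jitter helper are our additions and not part of the script above; jitter is a common refinement when several workers might retry at the same moment.

```python
import random


def backoff_delays(max_retries=5, base=1.0, cap=60.0):
    """Delays in seconds for each attempt: base * 2**attempt, never above cap."""
    return [min(cap, base * 2 ** attempt) for attempt in range(max_retries)]


def jittered(delay, rng=random):
    """Full jitter: wait a random time between 0 and the computed delay."""
    return rng.uniform(0, delay)


if __name__ == "__main__":
    for delay in backoff_delays():
        print(f"planned {delay:.0f}s, jittered {jittered(delay):.2f}s")
```

With the defaults, backoff_delays() reproduces the 1, 2, 4, 8, 16 second schedule described above. The cap only matters if you raise max_retries.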

Sitemap-driven main loop

Reads the sitemap, deduplicates against URLs we have already submitted (stored in a CSV), and submits each new URL through the rate-limited submission function.

import csv
import os
from lxml import etree
import requests

SITEMAP_URL = "https://yoursite.com/sitemap.xml"
CSV_FILE = "submitted_urls.csv"

def get_sitemap_urls(sitemap_url):
    response = requests.get(sitemap_url, timeout=30)
    response.raise_for_status()
    root = etree.fromstring(response.content)
    ns = "{http://www.sitemaps.org/schemas/sitemap/0.9}"
    return [loc.text for loc in root.findall(f".//{ns}loc")]

def read_submitted(csv_file):
    if not os.path.exists(csv_file):
        return set()
    with open(csv_file, newline="") as f:
        return {row[0] for row in csv.reader(f) if row}

def append_submitted(csv_file, url):
    with open(csv_file, "a", newline="") as f:
        csv.writer(f).writerow([url])

def main():
    session = get_authed_session()
    submitted = read_submitted(CSV_FILE)
    urls = get_sitemap_urls(SITEMAP_URL)
    new_urls = [u for u in urls if u not in submitted]

    print(f"Submitting {len(new_urls)} of {len(urls)} URLs")
    for url in new_urls:
        if submit_with_backoff(session, url):
            append_submitted(CSV_FILE, url)
            print(f"ok: {url}")
        else:
            print(f"stopped at: {url}")
            break

if __name__ == "__main__":
    main()

The CSV-as-state pattern is deliberately boring. You can read it, edit it, and reset it manually. We have run this exact pattern on production sites for two years without needing a database. If you want something fancier, swap the CSV for SQLite or Redis. The interface is the same.
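
If you do outgrow the CSV, a SQLite version might look something like the sketch below. It keeps the same two function names and signatures, so main() would not need to change. The table name is an assumption for illustration.

```python
import sqlite3


def _connect(db_file):
    """Open the database and make sure the single table exists."""
    conn = sqlite3.connect(db_file)
    conn.execute("CREATE TABLE IF NOT EXISTS submitted (url TEXT PRIMARY KEY)")
    return conn


def read_submitted(db_file):
    """Return the set of URLs already recorded as submitted."""
    conn = _connect(db_file)
    try:
        return {row[0] for row in conn.execute("SELECT url FROM submitted")}
    finally:
        conn.close()


def append_submitted(db_file, url):
    """Record one URL; inserting the same URL twice is a no-op."""
    conn = _connect(db_file)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT OR IGNORE INTO submitted (url) VALUES (?)", (url,))
    finally:
        conn.close()
```

The primary key gives you deduplication for free, which the CSV version only gets by reading everything into a set first.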

Quotas, Rate Limits, and Daily Operations

Defaults you need to plan around:

200 URL submissions per day per Google Cloud project. This is the default daily quota for new projects, and every service account in the project shares it. Going over returns 429. Once you hit it, the API rejects everything until midnight Pacific Time.
600 URL submissions per minute burst limit. You will not hit this in normal operation but it matters if you batch-submit.
Quota increase requests. Available via the Google Cloud Console with a documented use case. Large publishers routinely get quotas of 10,000+ per day.
No retroactive submission. The API does not have a backfill mode. If you missed a day, the URLs are not “behind” in any meaningful way. Just submit them tomorrow.

We run this script as a daily cron job at 6am Brisbane time. It picks up any URLs from the sitemap that have not been submitted before, submits them up to the daily quota, and stops cleanly when the quota hits. Run-time is usually under two minutes.
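
One thing to be aware of: main() above stops at the first URL that fails for any reason, not only on a 429, so a single permanently failing URL can block everything behind it. The sketch below is one hedged way to separate "skip this URL" from "stop the run" and to cap a run at the daily quota. The Outcome enum, run_batch, and the submit callable are hypothetical names, and counting every attempt against the quota is a conservative assumption rather than documented behaviour.

```python
from enum import Enum


class Outcome(Enum):
    OK = "ok"
    SKIP = "skip"          # this URL failed; move on to the next one
    RATE_LIMITED = "stop"  # quota exhausted; end the run

DAILY_QUOTA = 200  # the default quoted above; raise it if yours was increased


def run_batch(urls, submit, daily_quota=DAILY_QUOTA):
    """Submit URLs until the quota is spent or the API reports a rate limit.

    `submit` is any callable that takes a URL and returns an Outcome.
    Returns a (submitted, skipped) pair of lists.
    """
    submitted, skipped = [], []
    attempts = 0
    for url in urls:
        if attempts >= daily_quota:
            break
        attempts += 1  # assumption: count every attempt, successful or not
        outcome = submit(url)
        if outcome is Outcome.OK:
            submitted.append(url)
        elif outcome is Outcome.RATE_LIMITED:
            break
        else:
            skipped.append(url)
    return submitted, skipped
```

Passing the submit function in as an argument keeps the loop testable without touching the network: a fake that returns canned outcomes is enough to check the stop and skip behaviour.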

Common Pitfalls We Have Hit

Five things that cost us debugging time. Worth knowing.

403 Forbidden after Search Console permission added. Google’s permission propagation can take anywhere from five minutes to a few hours. Wait at least 30 minutes before assuming the permission did not stick.

Service account JSON file committed to git. We have seen this happen. Treat the JSON file like a password. Add it to .gitignore. Better yet, store it in a secret manager (AWS Secrets Manager, GCP Secret Manager, 1Password CLI) and load it at runtime.
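
As a sketch of the load-at-runtime approach, the variant below reads the key JSON from an environment variable instead of a file on disk. The variable name INDEXING_SA_KEY_JSON and both function names are hypothetical; how the value gets into the environment (secret manager, CI secret, and so on) is up to your setup.

```python
import json
import os


def load_key_info(env_var="INDEXING_SA_KEY_JSON"):
    """Parse service account key JSON held in an environment variable."""
    raw = os.environ.get(env_var)
    if not raw:
        raise RuntimeError(f"{env_var} is not set")
    return json.loads(raw)


def get_authed_session_from_env(env_var="INDEXING_SA_KEY_JSON"):
    """Build an AuthorizedSession without a key file on disk."""
    # Imported here so load_key_info works even where google-auth is not installed.
    from google.oauth2 import service_account
    from google.auth.transport.requests import AuthorizedSession

    credentials = service_account.Credentials.from_service_account_info(
        load_key_info(env_var),
        scopes=["https://www.googleapis.com/auth/indexing"],
    )
    return AuthorizedSession(credentials)
```

This is a drop-in alternative to get_authed_session() from the authentication module, with the same return type.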

Multiple Search Console properties. If you have both the www and non-www properties registered, or both http and https, you need to add the service account as Owner to the specific property the URL belongs to. A Domain property (the umbrella property that spans every protocol and subdomain) is the safest because it covers all the variants.

Sitemap returning HTML instead of XML. Some CDNs return cached HTML when the sitemap URL is requested without the headers they expect. Send an explicit Accept header for XML with requests.get and, if that is not enough, a User-Agent header that includes “GoogleBot” or similar to avoid the cached HTML version.
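
Whatever headers you send, it is worth checking what came back before parsing it. The sketch below uses the standard library XML parser and a hypothetical helper name; it also flags sitemap index files, whose loc entries point at child sitemaps rather than pages and would otherwise be submitted as if they were page URLs.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def sitemap_kind(body):
    """Return 'urlset', 'sitemapindex', or None if the body is not a sitemap."""
    try:
        root = ET.fromstring(body)
    except ET.ParseError:
        # Not well-formed XML at all, for example a cached HTML page.
        return None
    for kind in ("urlset", "sitemapindex"):
        if root.tag == f"{SITEMAP_NS}{kind}":
            return kind
    return None
```

In get_sitemap_urls you could call sitemap_kind(response.content) and bail out with a clear error when the result is not "urlset", instead of silently getting an empty or wrong URL list.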

Idempotency with the CSV. If two cron jobs run at the same time (overlap on the hour), they both read the CSV, submit the same URLs, and both write entries. Solution: file lock the CSV (use flock on Linux or msvcrt.locking on Windows) or use a proper job lock with Redis. We use a simple lock file pattern.
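
A minimal sketch of a lock file pattern, assuming a single machine and a local filesystem. The file name and function name are hypothetical, and this is one way to do it rather than the only one.

```python
import os
from contextlib import contextmanager


@contextmanager
def run_lock(path="indexing.lock"):
    """Hold an exclusive lock file for the duration of the run."""
    try:
        # O_EXCL makes creation fail if the file already exists.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        raise RuntimeError(f"another run holds {path}") from None
    try:
        # Record the owning process ID to help debug a stale lock.
        os.write(fd, str(os.getpid()).encode())
        yield
    finally:
        os.close(fd)
        os.remove(path)
```

Wrapping the body of main() in "with run_lock():" makes an overlapping cron run fail fast instead of double-submitting. The trade-off is that a crash that kills the process outright leaves the lock file behind, and you have to delete it by hand.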

Alternatives for Different Use Cases

The Indexing API is not the only tool. Four real alternatives, each with a use case.

Search Console URL inspection. Manual, one-URL-at-a-time, but works for any content type. Right for ad-hoc submissions when you just want to get one specific page into the Google index.
IndexNow. Cross-engine API supported by Bing, Yandex, and several others (notably not Google). Different semantics but similar intent. We pair it with the Google Indexing API on client sites with multi-engine SEO goals.
RSS sitemap pinging. Old pattern. Generate an RSS feed of new URLs and ping the major search engines. Mostly superseded but still works as a backup.
Just submit a clean sitemap. The boring answer. A sitemap that updates promptly, contains all your URLs, and pings Search Console on update will do most of the work the API does. Resort to the API when this is not enough.
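
For the IndexNow option, the request is a single JSON POST. The sketch below only builds the body, based on our reading of the public IndexNow protocol (host, key, optional keyLocation, urlList); the function name is hypothetical and you should check the current protocol documentation before relying on it.

```python
from urllib.parse import urlparse

# Shared endpoint described by the IndexNow protocol.
INDEXNOW_ENDPOINT = "https://api.indexnow.org/indexnow"


def build_indexnow_payload(urls, key, key_location=None):
    """Build the JSON body for one IndexNow submission."""
    urls = list(urls)
    hosts = {urlparse(u).netloc for u in urls}
    if len(hosts) != 1:
        # One submission covers one host, so mixed or empty input is rejected.
        raise ValueError(f"expected URLs from exactly one host, got {sorted(hosts)}")
    payload = {"host": hosts.pop(), "key": key, "urlList": urls}
    if key_location:
        payload["keyLocation"] = key_location
    return payload
```

Sending it is then a matter of requests.post(INDEXNOW_ENDPOINT, json=payload, timeout=30), with the key being the one you host as a text file on your own domain.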

If you are running broader SEO automation, our piece on related posts with TF-IDF covers an adjacent script we use on the same sites.

When the Google Indexing API Is the Wrong Tool

Three cases where reaching for this API does not help.

Your problem is content quality, not crawl speed. If pages are not indexing because Google considers them low quality, the API will not help. The pages will be crawled and then skipped. Fix the content first.

Your sitemap is broken. A broken sitemap is a sign of deeper issues (sitemap returning HTML, 500 errors on some URLs, malformed XML). Fix the sitemap first. The API is not a workaround for a broken sitemap.

You are trying to get already-indexed URLs to rank better. The API affects crawl scheduling, not ranking. Submitting an already-indexed page does nothing for its position. The right tools for ranking work are content improvement, internal linking, and earning external links.

For most evergreen content on most sites, you do not need this API. For time-sensitive content on sites with high content velocity, it earns its place. Match the tool to the actual problem.

Frequently Asked Questions

What is the Google Indexing API and what does it actually do?

The Google Indexing API lets you notify Google about URL changes (added or removed pages) so the crawler queues them faster than waiting for the normal sitemap crawl cycle. It is intended for JobPosting and BroadcastEvent structured data only, although it does work for general URLs in practice. Submitting URLs through the API does not guarantee they will be indexed. It only puts them in a priority queue.

Can I use the Google Indexing API for normal blog posts?

Officially no, the API is documented for JobPosting and BroadcastEvent only. In practice it still queues general URLs for crawl. Google has not removed the wider functionality but has reminded developers that it is not the intended use case. Use it for time-sensitive content where the sitemap delay matters. Do not expect indexing-rate improvements for evergreen content.

What are the quotas on the Google Indexing API?

The default quota is 200 URL submissions per day per Google Cloud project, with a burst limit of 600 per minute. You can request a quota increase via the Google Cloud Console if you have a real volume need. We have seen large publishers get quotas in the 10,000 per day range with documented use cases.

How do I authenticate the Google Indexing API in Python?

Create a service account in Google Cloud, generate a JSON key file, and use google.oauth2.service_account.Credentials.from_service_account_file with the indexing scope. Then add the service account email as an Owner of your Search Console property. Skipping the Search Console owner step is the single most common reason the API returns 403 Forbidden.

Why is my Google Indexing API returning 403 Forbidden?

Almost always because the service account is not added as an Owner in Search Console for the property the URL belongs to. The OAuth2 scope and the JSON key file can be correct, but without Search Console ownership the API rejects the request.

Can the Google Indexing API force Google to recrawl a page?

It can put the URL in a priority queue. It cannot force the crawler to actually visit. Most URLs submitted via the API get crawled within a few hours. Some never get crawled, especially if the page returns thin content or low-quality signals.

What is the difference between the Indexing API and the Search Console sitemap submission?

The sitemap submission is the long-form, batch, low-priority pathway. The Indexing API is the short-form, real-time, higher-priority pathway. Use the sitemap for the corpus, the API for time-sensitive events.

Are there alternatives to the Google Indexing API for faster indexing?

Three real alternatives. Search Console URL inspection for one-off URLs. IndexNow for Bing and Yandex. And a clean, well-pinged sitemap, which does most of the work the API would do.

If you need help wiring this into a wider SEO automation stack (sitemap monitoring, IndexNow, link tracking, content freshness checks), get in touch. We work with publishers, e-commerce sites, and content platforms on this kind of programmatic SEO infrastructure. If you want to talk through whether the Indexing API is right for your specific case, book a call and we will walk through it.
