DigiLocker API Integration with Meri Pehchaan

Digital identity and verified document access are becoming core building blocks for modern citizen services, financial onboarding, education platforms, healthcare systems, and enterprise compliance workflows.

In India, DigiLocker and Meri Pehchaan play a major role in this transformation. DigiLocker enables citizens to access and share verified digital documents, while Meri Pehchaan provides a single sign-on (SSO) layer for secure authentication and consent-based access.

Deceptive Simplicity: For many teams, DigiLocker integration looks simple at the beginning: redirect the user, get authorization, fetch documents, and store the result. In real production environments, the complexity runs much deeper.

You need to handle OAuth flows, PKCE security, token expiry, document scopes, XML/PDF extraction, citizen profile mapping, re-authentication scenarios, and verified data synchronization. Without the right architecture, teams often end up with broken user journeys, token failures, incomplete document mapping, and hardcoded integration logic that becomes difficult to maintain.

At VAF.ai, we approach DigiLocker integration as a complete lifecycle: Authenticate the citizen → collect consent → manage tokens → fetch profile/documents → extract verified fields → sync with business records → handle errors and re-authentication.

Why DigiLocker Integration Matters

DigiLocker is not just a document storage system. It is a trusted digital document exchange layer that helps platforms reduce manual uploads, improve verification accuracy, and build faster onboarding journeys.

For government and enterprise applications, DigiLocker integration can support use cases such as:

Citizen identity verification
Education certificate verification
Caste, income, residence, and other certificate access
Applicant profile auto-fill
KYC and onboarding
Document-based eligibility checks
Verified profile creation
Digital service delivery workflows

Instead of asking users to manually upload scanned copies, an application can request consent and fetch verified documents directly from DigiLocker. This creates a better user experience and improves data reliability.

However, to make this work in production, the integration must be secure, configurable, and resilient.

What is Meri Pehchaan?

Meri Pehchaan is the Government of India’s single sign-on platform built on DigiLocker. It supports Aadhaar-linked identity verification and document access through OAuth 2.0, OpenID Connect (OIDC), and PKCE-based authorization flows.

In a typical implementation, Meri Pehchaan acts as the authentication layer, while DigiLocker APIs provide access to profile and document data after user consent.

A secure implementation usually includes:

OAuth 2.0 authorization code flow
PKCE (Proof Key for Code Exchange) with S256 challenge method
OpenID Connect identity flow
Consent-based document access
Access token and refresh token handling
Profile and document API calls
Application-level session generation

A simple login integration may only need the `openid` scope. If your application needs to access issued documents, it must also request a document scope such as `files.issueddocs`. The integration must clearly separate authentication-only flows from document-access flows.

High-Level DigiLocker Integration Architecture

A production-grade architecture usually contains the following components to prevent tight coupling and ensure secure flows:

Client Application: The web or mobile application where the citizen starts the login or document access flow.
IAM / Authentication Service: The backend service that generates the Meri Pehchaan redirect URL, manages PKCE, handles callbacks, exchanges authorization codes, and issues local application tokens.
Meri Pehchaan Authorization Server: The SSO provider that authenticates the user and returns the authorization code.
DigiLocker APIs: APIs used to fetch the user profile, issued documents, XML data, and PDF files.
Application Session Store: Stores local JWT sessions, refresh tokens, and metadata required for further DigiLocker access.
Golden Record / Business Record Store: Stores verified citizen profile data and document-derived fields.
Document Mapping Layer: Maps DigiLocker fields, XML nodes, and document attributes into application-specific fields.

This separation is important because DigiLocker integration should not be tightly coupled with one form, one service, or one hardcoded document type.

A scalable implementation should allow different services to configure which document type to fetch, which XML/PDF field to extract, and where that value should be mapped inside the application.

Step 1: Generate Meri Pehchaan SSO Redirect URL

The first step is to generate a secure SSO redirect URL.

In a production implementation, the backend should generate a PKCE code verifier and a corresponding code challenge. The backend keeps the verifier and the state value temporarily (e.g., in a secure, HTTP-only session cookie or a server-side session) and redirects the user to the Meri Pehchaan authorization URL.

The redirect request typically includes the following parameters:

Parameter	Purpose
`response_type`	Usually `code`
`client_id`	OAuth client ID provided by DigiLocker / Meri Pehchaan
`redirect_uri`	Registered callback URL where auth code is returned
`state`	Client-generated state parameter for request validation (CSRF protection)
`scope`	Requested scope, such as `openid` or `files.issueddocs`
`code_challenge`	Base64url-encoded SHA-256 hash of the PKCE code verifier
`code_challenge_method`	Usually `S256`

In the technical implementation, the SSO redirect API generates a Meri Pehchaan authorization URL using PKCE, where the code_verifier is created first, hashed using SHA-256, and sent as the code_challenge. This approach prevents authorization code interception attacks and is especially important for modern web and mobile applications.

sso-redirect.ts (TypeScript / Node.js)

import crypto from 'crypto';

interface RedirectParams {
  clientId: string;
  redirectUri: string;
  scope: string;
}

export function generateSSORedirectURL({ clientId, redirectUri, scope }: RedirectParams) {
  // 1. Generate secure random 43-character PKCE code_verifier
  const codeVerifier = crypto
    .randomBytes(32)
    .toString('base64url');

  // 2. Hash code_verifier using SHA-256 to create code_challenge
  const codeChallenge = crypto
    .createHash('sha256')
    .update(codeVerifier)
    .digest()
    .toString('base64url');

  // 3. Generate random state for CSRF validation
  const state = crypto.randomBytes(16).toString('hex');

  // 4. Build authorization URL
  const authUrl = new URL('https://meripehchaan.gov.in/oauth2/authorize');
  authUrl.searchParams.append('response_type', 'code');
  authUrl.searchParams.append('client_id', clientId);
  authUrl.searchParams.append('redirect_uri', redirectUri);
  authUrl.searchParams.append('state', state);
  authUrl.searchParams.append('scope', scope);
  authUrl.searchParams.append('code_challenge', codeChallenge);
  authUrl.searchParams.append('code_challenge_method', 'S256');

  return {
    url: authUrl.toString(),
    state,
    codeVerifier
  };
}
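
A quick way to sanity-check the PKCE step is to run the challenge derivation against the example values published in RFC 7636 (Appendix B). The sketch below is only an illustration: `deriveCodeChallenge` and `isValidCodeVerifier` are hypothetical helper names, not part of any DigiLocker or Meri Pehchaan SDK.

```typescript
import { createHash } from 'node:crypto';

// Derive the S256 code_challenge for a code_verifier (RFC 7636, section 4.2):
// base64url(SHA-256(ASCII(code_verifier))), without padding.
function deriveCodeChallenge(codeVerifier: string): string {
  return createHash('sha256').update(codeVerifier, 'ascii').digest('base64url');
}

// RFC 7636 requires 43 to 128 characters from the unreserved set.
function isValidCodeVerifier(codeVerifier: string): boolean {
  return /^[A-Za-z0-9\-._~]{43,128}$/.test(codeVerifier);
}

// Example values from RFC 7636, Appendix B.
const rfcVerifier = 'dBjftJeZ4CVP-mB92K27uhbUJU1p1r_wW1gFWFOEjXk';
const rfcChallenge = 'E9Melhoa2OwvFrEMTJguCHaoeK1t8URWbuGJSstw-cM';

console.log(isValidCodeVerifier(rfcVerifier));
console.log(deriveCodeChallenge(rfcVerifier) === rfcChallenge);
```

If the derived value does not match the RFC example, the usual culprits are standard base64 instead of base64url, leftover `=` padding, or hashing a decoded byte string instead of the verifier text itself.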
          

Step 2: Handle OAuth Callback and Token Exchange

After the citizen successfully authenticates through Meri Pehchaan, the browser is redirected back to your application callback URI with an authorization code.

The backend then performs the token exchange:

Receive authorization code and state from callback request.
Validate the state parameter against the stored state to prevent CSRF attacks.
Send the authorization code and original PKCE verifier to the Meri Pehchaan token endpoint.
Receive the access token and refresh token.
Use the access token to fetch the DigiLocker user profile.
Create or update the citizen record in your database.
Issue local application JWT access and refresh tokens.

Design Best Practice: Your application should not expose raw DigiLocker tokens directly to the frontend for long-term use. Instead, the backend should manage secure token storage in session metadata and issue application-level tokens for internal access control.
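
The first two callback steps (receive the parameters, validate the state) can be isolated into a small pure function that is easy to unit test. This is a minimal sketch under stated assumptions: `validateCallback` and its result shape are hypothetical, and the `error` parameter is the standard OAuth 2.0 error response field rather than anything specific to Meri Pehchaan.

```typescript
import { timingSafeEqual } from 'node:crypto';

type CallbackQuery = { code?: string; state?: string; error?: string };

type CallbackCheck =
  | { ok: true; code: string }
  | { ok: false; reason: 'provider_error' | 'missing_params' | 'state_mismatch' };

function validateCallback(query: CallbackQuery, storedState: string | undefined): CallbackCheck {
  // OAuth 2.0 servers report failures (such as denied consent) through `error`.
  if (query.error) return { ok: false, reason: 'provider_error' };

  // A missing stored state must fail: two undefined values would compare equal.
  if (!query.code || !query.state || !storedState) {
    return { ok: false, reason: 'missing_params' };
  }

  const received = Buffer.from(query.state);
  const expected = Buffer.from(storedState);
  // timingSafeEqual throws on unequal lengths, so compare lengths first.
  if (received.length !== expected.length || !timingSafeEqual(received, expected)) {
    return { ok: false, reason: 'state_mismatch' };
  }
  return { ok: true, code: query.code };
}

console.log(validateCallback({ code: 'abc', state: 's1' }, 's1'));
console.log(validateCallback({ code: 'abc', state: 's1' }, 's2'));
```

Keeping the check separate from the HTTP handler means each failure reason can be mapped to its own user-facing message instead of a single generic error.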

sso-callback.ts (TypeScript / Express)

import axios from 'axios';
import type { Request, Response } from 'express';
import { db } from './db'; // Assumed application data-access module (not shown)

export async function handleCallback(req: Request, res: Response) {
  const { code, state } = req.query;

  // 1. Retrieve stored state and code verifier from secure session cookie
  //    (req.cookies requires the cookie-parser middleware)
  const storedState = req.cookies.sso_state;
  const codeVerifier = req.cookies.sso_verifier;

  // Missing values must fail too: two undefined values would otherwise compare equal
  if (typeof code !== 'string' || typeof state !== 'string' || !storedState || !codeVerifier) {
    return res.status(400).json({ error: 'Missing callback parameters' });
  }
  if (state !== storedState) {
    return res.status(400).json({ error: 'State validation failed (CSRF mismatch)' });
  }

  try {
    // 2. Exchange authorization code & PKCE verifier for tokens.
    //    OAuth 2.0 token endpoints expect a form-encoded body; passing
    //    URLSearchParams makes axios send application/x-www-form-urlencoded.
    const tokenResponse = await axios.post(
      'https://meripehchaan.gov.in/oauth2/token',
      new URLSearchParams({
        grant_type: 'authorization_code',
        code,
        client_id: process.env.DIGILOCKER_CLIENT_ID ?? '',
        client_secret: process.env.DIGILOCKER_CLIENT_SECRET ?? '',
        redirect_uri: process.env.DIGILOCKER_REDIRECT_URI ?? '',
        code_verifier: codeVerifier
      })
    );

    const { access_token, refresh_token, expires_in } = tokenResponse.data;

    // 3. Fetch citizen profile from DigiLocker API
    const profileResponse = await axios.get('https://digilocker.gov.in/api/v1/profile', {
      headers: { Authorization: `Bearer ${access_token}` }
    });

    const citizen = profileResponse.data; // Contains name, dob, gender, digilockerid

    // 4. Upsert citizen in database (Golden Record) & issue local JWT
    const userSession = await db.citizens.upsertAndLogin({
      profile: citizen,
      tokens: { access_token, refresh_token, expires_in }
    });

    // The state and verifier are single-use, so drop them once the exchange succeeds
    res.clearCookie('sso_state');
    res.clearCookie('sso_verifier');
    res.cookie('app_session', userSession.jwt, { httpOnly: true, secure: true });
    return res.redirect('/dashboard');
  } catch (error) {
    console.error('SSO exchange failed:', error);
    return res.redirect('/login?error=sso_failed');
  }
}
          

Authentication-Only vs Document Access Flow

A common mistake in DigiLocker integration is treating all login flows the same. In reality, you should separate standard login and document access login:

Flow Type	Scope	Purpose
Standard login	`openid`	Authenticate citizen and fetch profile. Used for logging into citizen dashboards.
Document access login	`openid files.issueddocs`	Authenticate citizen and access issued documents. Used for applying for services/certificates.

The technical implementation separates these flows using different endpoints: one for normal SSO redirect and another for document access redirect. The document access flow can also include an optional req_doctype parameter to preselect a specific document type.

By separating these flows, the application can reduce unnecessary consent requests and improve user experience.
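
The split between the two flows can be expressed as a small helper that decides the scope and the optional document preselection. This is a sketch only: `buildFlowParams` is a hypothetical name, and the `EXAMPLE` doctype value is a placeholder, not a real DigiLocker document code.

```typescript
type SsoFlow = 'login' | 'document_access';

interface FlowOptions {
  flow: SsoFlow;
  reqDoctype?: string; // optional preselected document type (document flow only)
}

// Build the flow-specific query parameters; the shared OAuth parameters
// (client_id, redirect_uri, state, code_challenge) are added elsewhere.
function buildFlowParams({ flow, reqDoctype }: FlowOptions): URLSearchParams {
  const params = new URLSearchParams();
  params.set('scope', flow === 'login' ? 'openid' : 'openid files.issueddocs');
  // req_doctype is meaningless for a plain login, so it is ignored there.
  if (flow === 'document_access' && reqDoctype) {
    params.set('req_doctype', reqDoctype);
  }
  return params;
}

console.log(buildFlowParams({ flow: 'login' }).toString());
console.log(buildFlowParams({ flow: 'document_access', reqDoctype: 'EXAMPLE' }).toString());
```

Centralizing the scope decision in one place makes it harder for a login-only screen to request document consent by accident.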

Managing DigiLocker Token Lifecycle

Token lifecycle management is one of the most important parts of a production DigiLocker integration. A basic implementation may work during initial testing, but production users will face scenarios like:

Access token expired
Refresh token missing or failed
Multiple API calls trying to refresh the token at the same time (race condition)
DigiLocker rejecting the stored token

The technical implementation handles token lifecycle by storing DigiLocker tokens in session metadata and managing refresh logic automatically. It also uses a per-user async lock to avoid race conditions during concurrent refresh attempts.

token-lifecycle-manager.ts (TypeScript / Redis / Node.js)

import AsyncLock from 'async-lock';
import axios from 'axios';
import { db } from './db'; // Assumed application data-access module (not shown)

// async-lock is an in-process lock: it serializes refreshes within one Node.js
// instance. Deployments with several instances need a shared lock (e.g., in Redis).
const lock = new AsyncLock();

export async function getValidDigiLockerToken(userId: string): Promise<string> {
  // Lock token refresh execution per unique user ID
  return lock.acquire(userId, async () => {
    const session = await db.sessions.findByUserId(userId);

    if (!session || !session.refresh_token) {
      throw new Error('402: RE_AUTH_REQUIRED');
    }

    const now = Math.floor(Date.now() / 1000);
    // If token is valid for more than 5 minutes, return it
    if (session.token_expires_at - now > 300) {
      return session.access_token;
    }

    // If token is expired or expiring soon, execute OAuth 2.0 refresh
    try {
      const refreshResponse = await axios.post(
        'https://meripehchaan.gov.in/oauth2/token',
        // URLSearchParams makes axios send application/x-www-form-urlencoded
        new URLSearchParams({
          grant_type: 'refresh_token',
          refresh_token: session.refresh_token,
          client_id: process.env.DIGILOCKER_CLIENT_ID ?? '',
          client_secret: process.env.DIGILOCKER_CLIENT_SECRET ?? ''
        })
      );

      const { access_token, refresh_token: newRefreshToken, expires_in } = refreshResponse.data;

      // Keep the old refresh token if the server did not rotate it
      await db.sessions.updateTokens(userId, {
        access_token,
        refresh_token: newRefreshToken || session.refresh_token,
        token_expires_at: now + expires_in
      });

      return access_token;
    } catch (err) {
      console.error(`Token refresh failed for user ${userId}:`, err);
      throw new Error('403: OAUTH_REJECTED');
    }
  });
}
          

A practical token handling strategy should include:

Scenario	Action
Token is valid	Use existing token
Token expires soon	Refresh before making API call
Refresh token exists	Exchange refresh token for a new access token
Refresh fails or no refresh token	Ask user to re-authenticate (return re-auth response)
Token rejected by DigiLocker	Redirect user to SSO again
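
The first four rows of this table reduce to a pure decision function, which keeps the policy testable without any network calls. The sketch below is illustrative: `decideTokenAction`, the `StoredTokens` shape, and the 300-second margin (taken from the five-minute check in the code above) are assumptions, not a prescribed design.

```typescript
type TokenAction = 'use_existing' | 'refresh' | 'reauthenticate';

interface StoredTokens {
  accessToken?: string;
  refreshToken?: string;
  expiresAt?: number; // Unix time in seconds
}

const REFRESH_MARGIN_SECONDS = 300;

function decideTokenAction(tokens: StoredTokens | null, nowSeconds: number): TokenAction {
  if (!tokens) return 'reauthenticate';

  // Treat an unknown expiry as already expired.
  const remaining = (tokens.expiresAt ?? 0) - nowSeconds;
  if (tokens.accessToken && remaining > REFRESH_MARGIN_SECONDS) return 'use_existing';

  // Expired or expiring soon: refresh if possible, otherwise send the user back to SSO.
  return tokens.refreshToken ? 'refresh' : 'reauthenticate';
}

const nowExample = 1_700_000_000;
console.log(decideTokenAction({ accessToken: 'a', refreshToken: 'r', expiresAt: nowExample + 3600 }, nowExample));
console.log(decideTokenAction({ accessToken: 'a', refreshToken: 'r', expiresAt: nowExample + 60 }, nowExample));
```

The last table row (a token rejected by DigiLocker at call time) cannot be predicted from stored metadata, so it still has to be handled where the API response arrives.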

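In a production system, error codes should also be meaningful so that the frontend can handle them cleanly. The codes below are application-level conventions; in standard HTTP, 402 means Payment Required, so the custom meaning should be documented for API consumers: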

Code	Meaning	Recommended Action
`401`	DigiLocker not linked	Redirect user to SSO
`402`	Session expired or no refresh token	Redirect user to SSO
`403`	Token rejected by DigiLocker	Redirect user to SSO
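
One way to keep these codes consistent between backend and frontend is a shared response shape with a type guard. This is a hedged sketch: the reason strings, the `action` field, and the helper names are hypothetical conventions chosen for illustration.

```typescript
type ReauthStatus = 401 | 402 | 403;

interface ReauthResponse {
  status: ReauthStatus;
  reason: string;
  action: 'redirect_to_sso';
}

// Reason labels mirror the table above.
const REAUTH_REASONS: Record<ReauthStatus, string> = {
  401: 'DIGILOCKER_NOT_LINKED',
  402: 'SESSION_EXPIRED_OR_NO_REFRESH_TOKEN',
  403: 'TOKEN_REJECTED_BY_DIGILOCKER',
};

// Backend side: build the body returned with the matching status code.
function toReauthResponse(status: ReauthStatus): ReauthResponse {
  return { status, reason: REAUTH_REASONS[status], action: 'redirect_to_sso' };
}

// Frontend side: decide whether a response body should reopen the SSO flow.
function isReauthResponse(body: unknown): body is ReauthResponse {
  if (typeof body !== 'object' || body === null) return false;
  const candidate = body as Record<string, unknown>;
  return (
    (candidate.status === 401 || candidate.status === 402 || candidate.status === 403) &&
    candidate.action === 'redirect_to_sso'
  );
}

console.log(toReauthResponse(402));
console.log(isReauthResponse({ status: 500, error: 'boom' }));
```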

Fetching Issued Documents from DigiLocker

Once document access is granted, the application can fetch the list of issued documents from DigiLocker.

The issued documents response typically includes: Document name, Document type code, Description, Issuer, URI, MIME type, and Issue date.

The technical document includes an issued documents flow where /digilocker/documents lists the documents available in the citizen’s DigiLocker account, including document metadata such as name, doctype, issuer, URI, MIME type, and date. This document list is important because field extraction should not blindly assume that a particular certificate is available.

A reliable implementation should follow this flow:

Fetch all available documents.
Match against configured document type (e.g. 10th marksheet, driving license).
Select the correct document URI.
Fetch XML or PDF based on configuration.
Extract required value and return structured response.

This approach makes the integration flexible across departments, workflows, and document types.

fetch-documents.ts (TypeScript / Express)

import axios from 'axios';
import { getValidDigiLockerToken } from './token-lifecycle-manager';

export async function getIssuedDocuments(req, res) {
  try {
    const accessToken = await getValidDigiLockerToken(req.userId);

    // Fetch the full index of all certificates and documents issued to this citizen
    const response = await axios.get('https://digilocker.gov.in/api/v1/documents/issued', {
      headers: { Authorization: `Bearer ${accessToken}` }
    });

    // Return list containing: name, doctype, issuer, uri, mime, date
    return res.json({
      success: true,
      documents: response.data.items
    });
  } catch (err) {
    const status = err.message.startsWith('402') ? 402 : 403;
    return res.status(status).json({ error: err.message });
  }
}
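
Once the list is available, matching it against the configured document type can be done without assuming the certificate exists. The following is a minimal sketch: `findDocumentByType` is a hypothetical helper, the field names follow the metadata listed above, and the sample records are made up for illustration.

```typescript
interface IssuedDocument {
  name: string;
  doctype: string;
  issuer: string;
  uri: string;
  mime: string;
  date: string;
}

type DocumentLookup =
  | { found: true; document: IssuedDocument }
  | { found: false; availableDoctypes: string[] };

function findDocumentByType(documents: IssuedDocument[], doctype: string): DocumentLookup {
  const wanted = doctype.toUpperCase();
  const match = documents.find((doc) => doc.doctype.toUpperCase() === wanted);
  if (match) return { found: true, document: match };

  // Report what is available so the caller can show a helpful message.
  const availableDoctypes = Array.from(new Set(documents.map((doc) => doc.doctype)));
  return { found: false, availableDoctypes };
}

// Made-up sample data for illustration only.
const sampleDocuments: IssuedDocument[] = [
  { name: 'Aadhaar Card', doctype: 'ADHAR', issuer: 'UIDAI', uri: 'in.gov.uidai-ADHAR-12345678', mime: 'application/pdf', date: '01-01-2020' },
];

console.log(findDocumentByType(sampleDocuments, 'adhar'));
console.log(findDocumentByType(sampleDocuments, 'SSC_MARKSHEET'));
```

Returning an explicit "not found" result lets the calling workflow fall back to a manual upload or another document type instead of failing inside the extraction step.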
          

Field-Level Extraction from DigiLocker Documents

For many enterprise and government use cases, simply downloading a document is not enough. The application often needs to extract specific values, such as: Name, Date of birth, Certificate number, Father's name, Issuer, Education qualification, Marriage registration details, and Address fields.

This is where a configuration-driven extraction layer becomes valuable:

Configuration Field	Purpose
`doctype`	Document type code, such as education, Aadhaar, profile, or marriage
`xml_node`	XPath or node path to locate the data in the XML file
`xml_attribute`	Attribute name to extract from the XML node
`fetch_type`	Whether the system should fetch XML or PDF

The implementation also supports smart document chains for education and marriage documents. For example, the education chain can try multiple document types in order until a matching document is found.

Instead of hardcoding every document flow, the platform can allow administrators or workflow designers to configure which document type is required, which field should be extracted, which form field should receive the value, whether the source is profile, XML, or PDF, and whether fallback document chains should be used. At VAF.ai, this is the difference between a one-time API integration and a reusable integration framework.
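
A fallback chain of this kind can be sketched as an ordered list of document types that is walked until one is present in the citizen's locker. The names below (`resolveDocumentChain`, `EDUCATION_CHAIN`) and the doctype labels are placeholders for illustration, not actual DigiLocker document codes.

```typescript
interface LockerDocument {
  doctype: string;
  uri: string;
}

// Placeholder chain: highest qualification first, then progressively lower ones.
const EDUCATION_CHAIN = ['DEGREE_CERTIFICATE', 'HSC_MARKSHEET', 'SSC_MARKSHEET'];

// Return the first document whose type appears earliest in the chain.
function resolveDocumentChain(
  documents: LockerDocument[],
  chain: string[],
): LockerDocument | undefined {
  for (const doctype of chain) {
    const match = documents.find((doc) => doc.doctype === doctype);
    if (match) return match;
  }
  return undefined; // nothing in the chain is available
}

const locker: LockerDocument[] = [
  { doctype: 'SSC_MARKSHEET', uri: 'example-uri-ssc' },
  { doctype: 'HSC_MARKSHEET', uri: 'example-uri-hsc' },
];

console.log(resolveDocumentChain(locker, EDUCATION_CHAIN));
```

Because the chain is plain data, it can live in the same configuration store as the field mappings and be changed per service without a code release.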

document-extractor.ts (TypeScript / XML DOM / XPath)

import { DOMParser } from 'xmldom';
import xpath from 'xpath';

interface FieldMapping {
  field: string;
  xpath: string;
}

// Sample mapping configuration for Aadhaar and Marksheet extraction
const EXTRACTION_CONFIG: Record<string, FieldMapping[]> = {
  AADHAAR: [
    { field: 'fullName', xpath: '//CertificateData/PersonalID/@Name' },
    { field: 'uid', xpath: '//CertificateData/PersonalID/@Number' },
    { field: 'dob', xpath: '//CertificateData/PersonalID/@DOB' }
  ],
  SSC_MARKSHEET: [
    { field: 'rollNo', xpath: '//Certificate/Data/Performance/@RollNumber' },
    { field: 'marks', xpath: '//Certificate/Data/Performance/Subject[@Code="085"]/Marks/@Theory' }
  ]
};

export function extractFields(xmlString: string, doctype: string): Record<string, string> {
  const fields = EXTRACTION_CONFIG[doctype];
  if (!fields) throw new Error(`No configuration found for doctype ${doctype}`);

  // xmldom expects an explicit MIME type when parsing
  const doc = new DOMParser().parseFromString(xmlString, 'text/xml');
  const result: Record<string, string> = {};

  for (const config of fields) {
    const selected = xpath.select(config.xpath, doc);
    // Node-set queries return an array; scalar XPath results are skipped
    if (!Array.isArray(selected) || selected.length === 0) continue;

    // Attribute nodes carry nodeValue; element nodes fall back to textContent
    const value = selected[0].nodeValue || selected[0].textContent;
    if (value) result[config.field] = value;
  }

  return result;
}
          

Downloading DigiLocker Files

Some use cases require the original document file, usually as a PDF (e.g., storing a certificate copy against an application for review by department officials).

The technical implementation includes a file download API where the system receives the document URI, calls the DigiLocker file endpoint, processes the base64 response, decodes it, and returns binary file bytes.

Important considerations in production:

Validate user session before file access.
Ensure the document belongs to the authenticated user.
Avoid exposing raw DigiLocker tokens to the frontend.
Store files only when required by compliance policy.
Track document source and verification metadata.
Handle base64 padding and decoding safely.
Apply access control for downloaded documents.
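
The base64 handling item deserves particular care, because payloads may arrive with line breaks, missing padding, or the URL-safe alphabet. The sketch below shows one defensive approach; `decodeBase64Document` and `looksLikePdf` are hypothetical helpers, and whether a given DigiLocker endpoint returns base64 text or raw bytes should be confirmed against its actual responses.

```typescript
// Decode a base64 payload after normalizing whitespace, alphabet, and padding.
function decodeBase64Document(payload: string): Buffer {
  // Strip whitespace and map the URL-safe alphabet onto the standard one.
  const normalized = payload.replace(/\s+/g, '').replace(/-/g, '+').replace(/_/g, '/');
  if (normalized.length === 0 || !/^[A-Za-z0-9+/]*={0,2}$/.test(normalized)) {
    throw new Error('Payload is not valid base64');
  }

  const unpadded = normalized.replace(/=+$/, '');
  // A remainder of 1 can never come from a valid base64 encoding.
  if (unpadded.length % 4 === 1) {
    throw new Error('Payload has an impossible base64 length');
  }

  const padded = unpadded + '='.repeat((4 - (unpadded.length % 4)) % 4);
  return Buffer.from(padded, 'base64');
}

// Cheap sanity check before serving the bytes as application/pdf.
function looksLikePdf(bytes: Buffer): boolean {
  return bytes.subarray(0, 5).toString('latin1') === '%PDF-';
}

const decoded = decodeBase64Document('JVBERi0x\nLjQ'); // "%PDF-1.4" with a line break and no padding
console.log(looksLikePdf(decoded));
```

Validating before decoding matters because `Buffer.from(value, 'base64')` silently ignores invalid characters instead of throwing, which can hide a corrupted or truncated response.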

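download-file.ts (TypeScript / Node.js)

import axios from 'axios';
import { getValidDigiLockerToken } from './token-lifecycle-manager';

export async function downloadDocumentFile(req, res) {
  const { documentUri } = req.query; // e.g., 'in.gov.uidai-ADHAR-12345678'

  // Reject missing or repeated query values before building the upstream URL
  if (typeof documentUri !== 'string' || documentUri.length === 0) {
    return res.status(400).json({ error: 'A single documentUri is required' });
  }

  try {
    const accessToken = await getValidDigiLockerToken(req.userId);

    // 1. Fetch the file from the DigiLocker file endpoint.
    //    encodeURIComponent prevents the URI from altering the request path.
    const response = await axios.get(
      `https://digilocker.gov.in/api/v1/documents/file/${encodeURIComponent(documentUri)}`,
      {
        headers: { Authorization: `Bearer ${accessToken}` },
        // Read the body as raw bytes. If the endpoint returns base64 text
        // instead of binary, decode it before sending it to the client.
        responseType: 'arraybuffer'
      }
    );

    // 2. Set headers and send the file bytes to the client browser
    res.setHeader('Content-Type', 'application/pdf');
    res.setHeader('Content-Disposition', `attachment; filename="verified-document-${Date.now()}.pdf"`);

    return res.send(Buffer.from(response.data));
  } catch (err) {
    console.error('DigiLocker file download failed:', err);
    // Surface re-authentication errors so the frontend can reopen the SSO flow
    const message = err instanceof Error ? err.message : '';
    if (message.startsWith('402') || message.startsWith('403')) {
      return res.status(Number(message.slice(0, 3))).json({ error: message });
    }
    return res.status(500).json({ error: 'Failed to download document file' });
  }
}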
          

Golden Record Auto-Sync

One of the strongest parts of a DigiLocker integration is verified profile synchronization. When a citizen logs in through Meri Pehchaan, the application can fetch verified profile data and sync it into a central citizen record. Each field is tagged with Meri Pehchaan as the source and marked as verified.

DigiLocker Field	Application Field
`name`	Citizen full name
`dob`	Date of birth
`gender`	Gender
`mobile`	Mobile number
`email`	Email
`digilockerid`	DigiLocker ID
`picture`	Profile picture
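
The mapping above, together with source tagging and overwrite protection, can be sketched as two small pure functions. Everything here is illustrative: the target field names, the `MERI_PEHCHAAN` and `MANUAL` source labels, and the record shape are assumptions rather than a fixed schema.

```typescript
interface VerifiedField {
  value: string;
  source: string;
  verified: boolean;
  updatedAt: string; // ISO timestamp
}

type GoldenRecord = Record<string, VerifiedField>;

// DigiLocker profile field -> application field (hypothetical target names).
const PROFILE_FIELD_MAP: Record<string, string> = {
  name: 'fullName',
  dob: 'dateOfBirth',
  gender: 'gender',
  mobile: 'mobileNumber',
  email: 'email',
  digilockerid: 'digilockerId',
  picture: 'profilePicture',
};

// Copy every profile field that has a value into the record, tagged as verified.
function syncVerifiedProfile(
  record: GoldenRecord,
  profile: Record<string, string | undefined>,
  syncedAt: string,
): GoldenRecord {
  const next: GoldenRecord = { ...record };
  for (const [sourceField, targetField] of Object.entries(PROFILE_FIELD_MAP)) {
    const value = profile[sourceField];
    if (!value) continue; // never blank out existing data with a missing value
    next[targetField] = { value, source: 'MERI_PEHCHAAN', verified: true, updatedAt: syncedAt };
  }
  return next;
}

// Manual edits are accepted only for fields that are not already verified.
function applyManualEdit(record: GoldenRecord, field: string, value: string, editedAt: string): GoldenRecord {
  if (record[field]?.verified) return record;
  return { ...record, [field]: { value, source: 'MANUAL', verified: false, updatedAt: editedAt } };
}

const synced = syncVerifiedProfile({}, { name: 'Asha Rao', dob: '01-01-1990' }, '2024-01-01T00:00:00Z');
console.log(applyManualEdit(synced, 'fullName', 'Someone Else', '2024-02-01T00:00:00Z').fullName.value);
```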

This is very useful for citizen service platforms. Instead of asking users to repeatedly enter the same personal details, the system can auto-fill verified information and reduce manual errors. For government workflows, this also helps build a consistent citizen profile across multiple services. Verified fields should be protected from accidental overwrites so that later manual edits do not silently replace trusted data.

Common Production Challenges in DigiLocker Integration

Based on real implementation experience, these are the most common challenges teams face:

Incorrect OAuth Scope: Many teams start with only openid scope and later realize that they cannot access issued documents. The solution is to clearly separate standard login and document access login.
PKCE Misconfiguration: If the code verifier and code challenge are not generated correctly, token exchange fails. Use SHA-256 and base64url encoding correctly, and make sure the verifier used in callback is the same verifier generated during redirect.
Token Expiry Issues: Access tokens expire. If the application does not manage refresh properly, users will face random failures while fetching documents. Store expiry metadata and refresh the token before making DigiLocker calls.
Concurrent Refresh Race Conditions: If multiple API calls happen at the same time and all try to refresh the token, the session may become inconsistent. Use a per-user lock or similar concurrency control mechanism.
Hardcoded Document Mapping: Hardcoding document type and XML fields may work for one service but fails when multiple departments or forms need different document fields. Use a configuration-driven mapping layer.
Poor Re-Authentication UX: When token refresh fails, the user should not see a technical error. The frontend should receive a clear response and reopen the SSO flow.
XML/PDF Differences: Some documents may provide structured XML, while others may require PDF handling. Your extraction layer should support both modes.
Missing Verified Source Metadata: If synced data is not tagged with source and verification status, downstream systems cannot trust it properly. Always store source, timestamp, and verification status.

Production Readiness Checklist

Before going live with your DigiLocker and Meri Pehchaan integration, validate the following security, compliance, and user experience requirements:

OAuth and SSO

PKCE is enabled with S256
Redirect URI is correctly registered
State parameter is validated
Client secret is securely stored
Callback errors are handled properly

Token Management

Access token expiry is tracked
Refresh token flow is implemented
Failed refresh triggers re-auth
Concurrent refresh is controlled
Tokens stored securely in backend session metadata

Document Access

Separate auth and document scopes
Required doctypes are configurable
Fetch document list before file access
XML and PDF fetch modes supported
Document URI handling is secure

Field Extraction

XML node and attribute mapping is configurable
Profile fields are handled separately
Education and marriage fallback chains are supported
Missing document scenarios are handled gracefully

Golden Record Sync

Verified profile fields mapped correctly
Source metadata is stored
Synced fields marked as verified
Existing citizen records updated safely
Duplicate citizen records are avoided

Security and Compliance

DigiLocker tokens not exposed to client
Sensitive documents access-controlled
Comprehensive audit logs maintained
Consent-based access respected
File storage follows retention policy

User Experience

Re-authentication flow is smooth
Errors are user-friendly
Auto-filled fields are clearly shown
Verified fields protected from overwrite
Mobile and web journeys are tested

How VAF.ai Helps

DigiLocker integration is not just an API task. It is a complete identity, consent, document, and verified data workflow. VAF.ai helps enterprises and government-facing platforms implement this lifecycle faster with reusable integration patterns and configurable workflows.

Secure Meri Pehchaan SSO integration
OAuth 2.0 & PKCE auth flows
DigiLocker document access workflows
Automatic token lifecycle handling
Configurable XML/PDF field extraction
Smart document fallback chains
Golden Record integration
Workflow-based document mapping

Instead of building every DigiLocker use case from scratch, teams can use VAF.ai to create repeatable integration flows that work across departments, services, and business processes.

Example Use Case: Citizen Service Application

Consider a citizen applying for a certificate through a digital government platform.

Without DigiLocker Integration

Manually enter personal details
Upload identity proof manually
Upload education or supporting certificates
Wait for manual verification by backend teams
Correct errors if uploaded documents are unclear or invalid

With DigiLocker & VAF.ai

The citizen logs in securely using Meri Pehchaan
Verified profile data is fetched automatically
The application requests consent for required documents
Available DigiLocker documents are listed and selected
Required XML/PDF fields are extracted automatically
Verified data is synced into the citizen profile
The application form is auto-filled instantly
Department users can review trusted document data in real-time

This lifecycle-based approach reduces manual effort, improves accuracy, and speeds up service delivery.

Why a Lifecycle-Based Approach Matters

Many integrations fail because they are treated as one API call. But DigiLocker integration has multiple lifecycle stages that must be handled properly:

Stage	What Needs to Be Handled
Authentication	SSO, OAuth, PKCE, callback handling
Consent	Scope and document permissions management
Token	Access token, refresh token, expiry metadata
Discovery	List issued documents available in citizen's locker
Fetch	XML/PDF/file download and safe base64 decoding
Extraction	Field-level mapping based on config templates
Sync	Golden Record updates with verified status tagging
Error Handling	Managing 401, 402, 403, and re-authentication redirects
Audit	Tracking source, verification status, and timestamps
Reuse	Configuring mappings across multiple department services

A lifecycle-based architecture makes the integration maintainable and scalable. That is the core value of VAF.ai.

Conclusion

DigiLocker integration can transform how platforms handle identity verification, document collection, and citizen data onboarding. But production-ready implementation requires more than basic API connectivity.

Whether you are building a citizen service platform, enterprise onboarding system, education verification flow, financial KYC workflow, or government digital service, VAF.ai can help you implement DigiLocker integration with a production-first approach.

Planning to integrate DigiLocker into your application?

VAF.ai can help you design and implement a secure, scalable, and production-ready DigiLocker integration with Meri Pehchaan SSO, document access, verified profile sync, and configurable field extraction.
