Digital identity and verified document access are becoming core building blocks for modern citizen services, financial onboarding, education platforms, healthcare systems, and enterprise compliance workflows.

In India, DigiLocker and Meri Pehchaan play a major role in this transformation. DigiLocker enables citizens to access and share verified digital documents, while Meri Pehchaan provides a single sign-on (SSO) layer for secure authentication and consent-based access.

Deceptive Simplicity: For many teams, DigiLocker integration looks simple at the beginning: redirect the user, get authorization, fetch documents, and store the result. In real production environments, the complexity runs much deeper.

You need to handle OAuth flows, PKCE security, token expiry, document scopes, XML/PDF extraction, citizen profile mapping, re-authentication scenarios, and verified data synchronization. Without the right architecture, teams often end up with broken user journeys, token failures, incomplete document mapping, and hardcoded integration logic that becomes difficult to maintain.

At VAF.ai, we approach DigiLocker integration as a complete lifecycle: Authenticate the citizen → collect consent → manage tokens → fetch profile/documents → extract verified fields → sync with business records → handle errors and re-authentication.

Why DigiLocker Integration Matters

DigiLocker is not just a document storage system. It is a trusted digital document exchange layer that helps platforms reduce manual uploads, improve verification accuracy, and build faster onboarding journeys.

For government and enterprise applications, DigiLocker integration can support use cases such as:

  • Citizen identity verification
  • Education certificate verification
  • Caste, income, residence, and other certificate access
  • Applicant profile auto-fill
  • KYC and onboarding
  • Document-based eligibility checks
  • Verified profile creation
  • Digital service delivery workflows

Instead of asking users to manually upload scanned copies, an application can request consent and fetch verified documents directly from DigiLocker. This creates a better user experience and improves data reliability.

However, to make this work in production, the integration must be secure, configurable, and resilient.

What is Meri Pehchaan?

Meri Pehchaan is the Government of India’s single sign-on platform built on DigiLocker. It supports Aadhaar-linked identity verification and document access through OAuth 2.0, OpenID Connect (OIDC), and PKCE-based authorization flows.

In a typical implementation, Meri Pehchaan acts as the authentication layer, while DigiLocker APIs provide access to profile and document data after user consent.

A secure implementation usually includes:

  • OAuth 2.0 authorization code flow
  • PKCE (Proof Key for Code Exchange) with S256 challenge method
  • OpenID Connect identity flow
  • Consent-based document access
  • Access token and refresh token handling
  • Profile and document API calls
  • Application-level session generation

A simple login integration may only need the openid scope. If your application also needs to access issued documents, it must request an additional document-access scope such as files.issueddocs. The integration must clearly separate authentication-only flows from document-access flows.

High-Level DigiLocker Integration Architecture

A production-grade architecture usually contains the following components to prevent tight coupling and ensure secure flows:

  • Client Application: The web or mobile application where the citizen starts the login or document access flow.
  • IAM / Authentication Service: The backend service that generates the Meri Pehchaan redirect URL, manages PKCE, handles callbacks, exchanges authorization codes, and issues local application tokens.
  • Meri Pehchaan Authorization Server: The SSO provider that authenticates the user and returns the authorization code.
  • DigiLocker APIs: APIs used to fetch the user profile, issued documents, XML data, and PDF files.
  • Application Session Store: Stores local JWT sessions, refresh tokens, and metadata required for further DigiLocker access.
  • Golden Record / Business Record Store: Stores verified citizen profile data and document-derived fields.
  • Document Mapping Layer: Maps DigiLocker fields, XML nodes, and document attributes into application-specific fields.

This separation is important because DigiLocker integration should not be tightly coupled with one form, one service, or one hardcoded document type.

A scalable implementation should allow different services to configure which document type to fetch, which XML/PDF field to extract, and where that value should be mapped inside the application.

Step 1: Generate Meri Pehchaan SSO Redirect URL

The first step is to generate a secure SSO redirect URL.

In a production implementation, the backend should generate a PKCE code verifier and a corresponding code challenge. The verifier is stored temporarily (e.g., in a secure session cookie) so that it can be sent again during token exchange, and the user is then redirected to the Meri Pehchaan authorization URL.

The redirect request typically includes the following parameters:

| Parameter | Purpose |
| --- | --- |
| response_type | Usually code |
| client_id | OAuth client ID provided by DigiLocker / Meri Pehchaan |
| redirect_uri | Registered callback URL where auth code is returned |
| state | Client-generated state parameter for request validation (CSRF protection) |
| scope | Requested scope, such as openid or files.issueddocs |
| code_challenge | SHA-256 hashed PKCE challenge |
| code_challenge_method | Usually S256 |

In the technical implementation, the SSO redirect API generates a Meri Pehchaan authorization URL using PKCE, where the code_verifier is created first, hashed using SHA-256, and sent as the code_challenge. This approach prevents authorization code interception attacks and is especially important for modern web and mobile applications.

sso-redirect.ts TypeScript / Node.js
import crypto from 'crypto';

interface RedirectParams {
  clientId: string;
  redirectUri: string;
  scope: string;
}

export function generateSSORedirectURL({ clientId, redirectUri, scope }: RedirectParams) {
  // 1. Generate secure random 43-character PKCE code_verifier
  const codeVerifier = crypto
    .randomBytes(32)
    .toString('base64url');

  // 2. Hash code_verifier using SHA-256 to create code_challenge
  const codeChallenge = crypto
    .createHash('sha256')
    .update(codeVerifier)
    .digest()
    .toString('base64url');

  // 3. Generate random state for CSRF validation
  const state = crypto.randomBytes(16).toString('hex');

  // 4. Build authorization URL
  const authUrl = new URL('https://meripehchaan.gov.in/oauth2/authorize');
  authUrl.searchParams.append('response_type', 'code');
  authUrl.searchParams.append('client_id', clientId);
  authUrl.searchParams.append('redirect_uri', redirectUri);
  authUrl.searchParams.append('state', state);
  authUrl.searchParams.append('scope', scope);
  authUrl.searchParams.append('code_challenge', codeChallenge);
  authUrl.searchParams.append('code_challenge_method', 'S256');

  return {
    url: authUrl.toString(),
    state,
    codeVerifier
  };
}
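
Because a wrong encoding in either PKCE value only surfaces later as a failed token exchange, it can help to check the pair locally first. The following is a minimal sketch, assuming Node.js with its built-in crypto module; computeS256Challenge and verifierMatchesChallenge are hypothetical helper names, not part of any DigiLocker SDK. It recomputes the S256 challenge the same way the authorization server is expected to (RFC 7636).

```typescript
import { createHash, timingSafeEqual } from 'node:crypto';

// Recompute the S256 code_challenge for a given code_verifier (RFC 7636, section 4.2):
// base64url( SHA-256( ASCII(code_verifier) ) ), without padding.
export function computeS256Challenge(codeVerifier: string): string {
  return createHash('sha256').update(codeVerifier, 'ascii').digest('base64url');
}

// Hypothetical helper: true when the verifier corresponds to the challenge
// that was sent in the redirect.
export function verifierMatchesChallenge(codeVerifier: string, codeChallenge: string): boolean {
  const expected = Buffer.from(computeS256Challenge(codeVerifier));
  const received = Buffer.from(codeChallenge);
  // timingSafeEqual throws on unequal lengths, so compare lengths first
  return expected.length === received.length && timingSafeEqual(expected, received);
}
```

A unit test can feed the verifier and challenge returned by the redirect generator into verifierMatchesChallenge to catch encoding mistakes (for example, standard base64 instead of base64url) before they reach the token endpoint.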

Step 2: Handle OAuth Callback and Token Exchange

After the citizen successfully authenticates through Meri Pehchaan, the browser is redirected back to your application callback URI with an authorization code.

The backend then performs the token exchange:

  1. Receive authorization code and state from callback request.
  2. Validate the state parameter against the stored state to prevent CSRF attacks.
  3. Send the authorization code and original PKCE verifier to the Meri Pehchaan token endpoint.
  4. Receive the access token and refresh token.
  5. Use the access token to fetch the DigiLocker user profile.
  6. Create or update the citizen record in your database.
  7. Issue local application JWT access and refresh tokens.

Design Best Practice: Your application should not expose raw DigiLocker tokens directly to the frontend for long-term use. Instead, the backend should manage secure token storage in session metadata and issue application-level tokens for internal access control.

sso-callback.ts TypeScript / Express
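import axios from 'axios';

// `db` below stands for the application's own data-access layer (not shown here)
export async function handleCallback(req, res) {
  const { code, state } = req.query;

  // 1. Retrieve stored state and code verifier from secure session cookie
  const storedState = req.cookies.sso_state;
  const codeVerifier = req.cookies.sso_verifier;

  // Reject a missing state as well as a mismatched one
  if (typeof state !== 'string' || !storedState || state !== storedState) {
    return res.status(400).json({ error: 'State validation failed (CSRF mismatch)' });
  }

  if (typeof code !== 'string' || !codeVerifier) {
    return res.status(400).json({ error: 'Missing authorization code or PKCE verifier' });
  }

  try {
    // 2. Exchange authorization code & PKCE verifier for tokens.
    //    OAuth 2.0 (RFC 6749) defines the token request as form-encoded,
    //    so send URLSearchParams rather than a JSON object
    const tokenResponse = await axios.post('https://meripehchaan.gov.in/oauth2/token', new URLSearchParams({
      grant_type: 'authorization_code',
      code,
      client_id: process.env.DIGILOCKER_CLIENT_ID ?? '',
      client_secret: process.env.DIGILOCKER_CLIENT_SECRET ?? '',
      redirect_uri: process.env.DIGILOCKER_REDIRECT_URI ?? '',
      code_verifier: codeVerifier
    }));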

    const { access_token, refresh_token, expires_in } = tokenResponse.data;

    // 3. Fetch citizen profile from DigiLocker API
    const profileResponse = await axios.get('https://digilocker.gov.in/api/v1/profile', {
      headers: { Authorization: `Bearer ${access_token}` }
    });

    const citizen = profileResponse.data; // Contains name, dob, gender, digilockerid

    // 4. Upsert citizen in database (Golden Record) & issue local JWT
    const userSession = await db.citizens.upsertAndLogin({
      profile: citizen,
      tokens: { access_token, refresh_token, expires_in }
    });

    res.cookie('app_session', userSession.jwt, { httpOnly: true, secure: true });
    return res.redirect('/dashboard');
  } catch (error) {
    console.error('SSO exchange failed:', error);
    return res.redirect('/login?error=sso_failed');
  }
}
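
A state check should reject missing values as well as mismatches, since two absent values would otherwise compare as equal. The sketch below shows one way to do that, assuming Node.js; isValidState is a hypothetical helper name, and the constant-time comparison is a defensive choice rather than something the source implementation is known to use.

```typescript
import { timingSafeEqual } from 'node:crypto';

// Hypothetical helper: validates the state returned on the callback against
// the value stored when the redirect was generated.
export function isValidState(received: unknown, stored: unknown): boolean {
  // Both values must be non-empty strings; undefined === undefined must not pass
  if (typeof received !== 'string' || typeof stored !== 'string') return false;
  if (received.length === 0 || stored.length === 0) return false;

  const a = Buffer.from(received);
  const b = Buffer.from(stored);
  // timingSafeEqual requires equal lengths, so check that first
  return a.length === b.length && timingSafeEqual(a, b);
}
```

In the callback handler this would replace a plain inequality check, for example `if (!isValidState(req.query.state, req.cookies.sso_state)) { ... }`.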

Authentication-Only vs Document Access Flow

A common mistake in DigiLocker integration is treating all login flows the same. In reality, you should separate standard login and document access login:

| Flow Type | Scope | Purpose |
| --- | --- | --- |
| Standard login | openid | Authenticate citizen and fetch profile. Used for logging into citizen dashboards. |
| Document access login | openid files.issueddocs | Authenticate citizen and access issued documents. Used for applying for services/certificates. |

The technical implementation separates these flows using different endpoints: one for normal SSO redirect and another for document access redirect. The document access flow can also include an optional req_doctype parameter to preselect a specific document type.

By separating these flows, the application can reduce unnecessary consent requests and improve user experience.
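
The two flows can share one builder that differs only in the scope and the optional req_doctype parameter. This is a minimal sketch that assumes the scope strings listed above; buildAuthorizeParams and the FlowType values are hypothetical names, and the remaining parameters (client_id, redirect_uri, state, and the PKCE values) would be added as in Step 1.

```typescript
type FlowType = 'login' | 'documents';

// Scope strings for each flow, as described in this section
const SCOPES: Record<FlowType, string> = {
  login: 'openid',
  documents: 'openid files.issueddocs',
};

// Hypothetical helper: returns the flow-specific part of the authorization query
export function buildAuthorizeParams(flow: FlowType, reqDoctype?: string): URLSearchParams {
  const params = new URLSearchParams({
    response_type: 'code',
    scope: SCOPES[flow],
  });
  // req_doctype is only meaningful when document access is being requested
  if (flow === 'documents' && reqDoctype) {
    params.set('req_doctype', reqDoctype);
  }
  return params;
}
```

Keeping the scope choice in one place makes it harder for a dashboard login to request document consent by accident.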

Managing DigiLocker Token Lifecycle

Token lifecycle management is one of the most important parts of a production DigiLocker integration. A basic implementation may work during initial testing, but production users will face scenarios like:

  • Access token expired
  • Refresh token missing or failed
  • Multiple API calls trying to refresh the token at the same time (race condition)
  • DigiLocker rejecting the stored token

The technical implementation handles token lifecycle by storing DigiLocker tokens in session metadata and managing refresh logic automatically. It also uses a per-user async lock to avoid race conditions during concurrent refresh attempts.

token-lifecycle-manager.ts TypeScript / Node.js
import axios from 'axios';
import AsyncLock from 'async-lock'; // In-process lock: serializes refreshes per key within one Node.js instance

const lock = new AsyncLock();

export async function getValidDigiLockerToken(userId: string): Promise<string> {
  // Lock token refresh execution per unique user ID
  return lock.acquire(userId, async () => {
    const session = await db.sessions.findByUserId(userId);
    
    if (!session || !session.refresh_token) {
      throw new Error('402: RE_AUTH_REQUIRED');
    }

    const now = Math.floor(Date.now() / 1000);
    // If token is valid for more than 5 minutes, return it
    if (session.token_expires_at - now > 300) {
      return session.access_token;
    }

    // If token is expired or expiring soon, execute OAuth 2.0 refresh
    try {
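      // OAuth 2.0 (RFC 6749) defines token requests as form-encoded, so send URLSearchParams rather than JSON
      const refreshResponse = await axios.post('https://meripehchaan.gov.in/oauth2/token', new URLSearchParams({
        grant_type: 'refresh_token',
        refresh_token: session.refresh_token,
        client_id: process.env.DIGILOCKER_CLIENT_ID ?? '',
        client_secret: process.env.DIGILOCKER_CLIENT_SECRET ?? ''
      }));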

      const { access_token, refresh_token: newRefreshToken, expires_in } = refreshResponse.data;

      await db.sessions.updateTokens(userId, {
        access_token,
        refresh_token: newRefreshToken || session.refresh_token,
        token_expires_at: now + expires_in
      });

      return access_token;
    } catch (err) {
      console.error(`Token refresh failed for user ${userId}:`, err);
      throw new Error('403: OAUTH_REJECTED');
    }
  });
}
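
To make the per-user locking idea concrete without relying on a library, the following sketch implements a small per-key lock with promise chaining. It is an illustration only: withKeyLock is a hypothetical helper, and like any in-memory lock it only serializes work inside a single Node.js process. A deployment with several instances would need a shared lock (for example, one backed by Redis or the database).

```typescript
// Tail of the pending work for each key; tails never reject.
const tails = new Map<string, Promise<void>>();

// Hypothetical helper: runs tasks that share a key strictly one after another.
export function withKeyLock<T>(key: string, task: () => Promise<T>): Promise<T> {
  const previous = tails.get(key) ?? Promise.resolve();

  // Start this task only after the previous task for the same key has settled
  const run = previous.then(() => task());

  // Swallow the outcome in the stored tail so one failure does not block later callers
  const tail = run.then(() => undefined, () => undefined);
  tails.set(key, tail);

  // Remove the entry once no newer task has been queued behind this one
  void tail.then(() => {
    if (tails.get(key) === tail) tails.delete(key);
  });

  return run;
}
```

A refresh routine would be wrapped as `withKeyLock(userId, () => refreshTokens(userId))`, where refreshTokens is a placeholder for the application's own refresh logic. Because the second caller starts only after the first finishes, it can re-read the session and find a fresh token instead of refreshing again.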

A practical token handling strategy should include:

| Scenario | Action |
| --- | --- |
| Token is valid | Use existing token |
| Token expires soon | Refresh before making API call |
| Refresh token exists | Exchange refresh token for a new access token |
| Refresh fails or no refresh token | Ask user to re-authenticate (return re-auth response) |
| Token rejected by DigiLocker | Redirect user to SSO again |
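
The first four rows of this strategy can be expressed as a small pure function, which keeps the decision testable without network calls. This is a sketch under assumptions: StoredTokens, TokenAction, and decideTokenAction are hypothetical names, and the 300-second margin mirrors the five-minute window used in the token manager.

```typescript
interface StoredTokens {
  accessToken: string;
  refreshToken?: string;
  expiresAt: number; // Unix time in seconds
}

type TokenAction = 'use' | 'refresh' | 'reauth';

// Refresh when the token has five minutes or less left
const REFRESH_MARGIN_SECONDS = 300;

// Hypothetical helper: decides what to do before calling a DigiLocker API
export function decideTokenAction(tokens: StoredTokens | null, nowSeconds: number): TokenAction {
  if (!tokens) return 'reauth'; // DigiLocker was never linked for this session
  if (tokens.expiresAt - nowSeconds > REFRESH_MARGIN_SECONDS) return 'use';
  // Expired or expiring soon: refresh if possible, otherwise send the user back to SSO
  return tokens.refreshToken ? 'refresh' : 'reauth';
}
```

The last row (a token rejected by DigiLocker itself) cannot be predicted locally; it has to be handled when the API call fails.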

In a production system, error codes should also be meaningful to make frontend handling cleaner:

| Code | Meaning | Recommended Action |
| --- | --- | --- |
| 401 | DigiLocker not linked | Redirect user to SSO |
| 402 | Session expired or no refresh token | Redirect user to SSO |
| 403 | Token rejected by DigiLocker | Redirect user to SSO |
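
Carrying these codes in a typed error, rather than parsing them out of a message string, keeps route handlers simple. The sketch below assumes the three codes above; DigiLockerAuthError and toErrorResponse are hypothetical names, and the reauth flag is an illustrative field a frontend could use to reopen the SSO flow.

```typescript
type ReauthStatus = 401 | 402 | 403;

// Hypothetical error type that carries the status code explicitly
export class DigiLockerAuthError extends Error {
  readonly status: ReauthStatus;
  readonly code: string;

  constructor(status: ReauthStatus, code: string) {
    super(`${status}: ${code}`);
    this.name = 'DigiLockerAuthError';
    this.status = status;
    this.code = code;
  }
}

// Hypothetical helper: converts any thrown value into an HTTP status and JSON body
export function toErrorResponse(err: unknown): { status: number; body: { error: string; reauth: boolean } } {
  if (err instanceof DigiLockerAuthError) {
    // All three codes lead to the same frontend action: restart SSO
    return { status: err.status, body: { error: err.code, reauth: true } };
  }
  // Anything else (network failure, bug) is not a re-authentication case
  return { status: 500, body: { error: 'INTERNAL_ERROR', reauth: false } };
}
```

With this in place, an unrelated failure such as a network timeout is reported as a server error instead of being mistaken for a rejected token.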

Fetching Issued Documents from DigiLocker

Once document access is granted, the application can fetch the list of issued documents from DigiLocker.

The issued documents response typically includes: Document name, Document type code, Description, Issuer, URI, MIME type, and Issue date.

The technical document includes an issued documents flow where /digilocker/documents lists the documents available in the citizen’s DigiLocker account, including document metadata such as name, doctype, issuer, URI, MIME type, and date. This document list is important because field extraction should not blindly assume that a particular certificate is available.

A reliable implementation should follow this flow:

  1. Fetch all available documents.
  2. Match against configured document type (e.g. 10th marksheet, driving license).
  3. Select the correct document URI.
  4. Fetch XML or PDF based on configuration.
  5. Extract required value and return structured response.

This approach makes the integration flexible across departments, workflows, and document types.
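
Steps 2 and 3 of this flow can be sketched as a lookup over the issued-documents list. The example assumes each list item exposes the metadata fields named above (name, doctype, issuer, uri, mime, date); IssuedDocument and selectDocument are hypothetical names, and any doctype codes passed in are whatever the service has configured.

```typescript
interface IssuedDocument {
  name: string;
  doctype: string;
  issuer: string;
  uri: string;
  mime: string;
  date: string;
}

// Hypothetical helper: returns the first issued document that matches one of the
// configured doctypes, trying them in priority order.
export function selectDocument(
  documents: IssuedDocument[],
  doctypes: string[]
): IssuedDocument | undefined {
  for (const doctype of doctypes) {
    const match = documents.find((doc) => doc.doctype === doctype);
    if (match) return match;
  }
  // Nothing matched: the caller should report a missing document, not fail on extraction
  return undefined;
}
```

Returning undefined instead of throwing lets the caller decide whether a missing certificate is an error or simply a field the citizen must fill in manually.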

fetch-documents.ts TypeScript / Express
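import axios from 'axios';
import { getValidDigiLockerToken } from './token-lifecycle-manager'; // module path assumed from the file name used earlier

export async function getIssuedDocuments(req, res) {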
  try {
    const accessToken = await getValidDigiLockerToken(req.userId);

    // Fetch the full index of all certificates and documents issued to this citizen
    const response = await axios.get('https://digilocker.gov.in/api/v1/documents/issued', {
      headers: { Authorization: `Bearer ${accessToken}` }
    });

    // Return list containing: name, doctype, issuer, uri, mime, date
    return res.json({
      success: true,
      documents: response.data.items
    });
  } catch (err) {
    const status = err.message.startsWith('402') ? 402 : 403;
    return res.status(status).json({ error: err.message });
  }
}

Field-Level Extraction from DigiLocker Documents

For many enterprise and government use cases, simply downloading a document is not enough. The application often needs to extract specific values, such as: Name, Date of birth, Certificate number, Father's name, Issuer, Education qualification, Marriage registration details, and Address fields.

This is where a configuration-driven extraction layer becomes valuable:

| Configuration Field | Purpose |
| --- | --- |
| doctype | Document type code, such as education, Aadhaar, profile, or marriage |
| xml_node | XPath or node path to locate the data in the XML file |
| xml_attribute | Attribute name to extract from the XML node |
| fetch_type | Whether the system should fetch XML or PDF |

The implementation also supports smart document chains for education and marriage documents. For example, the education chain can try multiple document types in order until a matching document is found.

Instead of hardcoding every document flow, the platform can allow administrators or workflow designers to configure which document type is required, which field should be extracted, which form field should receive the value, whether the source is profile, XML, or PDF, and whether fallback document chains should be used. At VAF.ai, this is the difference between a one-time API integration and a reusable integration framework.
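
As an illustration of what such configuration might look like on the form side, the sketch below maps values from different sources into application form fields. The shape of FieldMapping and the applyMappings helper are assumptions made for this example, not the VAF.ai schema.

```typescript
type Source = 'profile' | 'xml' | 'pdf';

// Hypothetical mapping rule: where a value comes from and which form field receives it
interface FieldMapping {
  source: Source;
  sourceField: string; // key in the profile or in the extracted-fields object
  formField: string;   // target field in the application form
}

// Hypothetical helper: builds the auto-fill payload from the values available per source
export function applyMappings(
  mappings: FieldMapping[],
  values: Partial<Record<Source, Record<string, string>>>
): Record<string, string> {
  const form: Record<string, string> = {};
  for (const mapping of mappings) {
    const value = values[mapping.source]?.[mapping.sourceField];
    // Skip fields whose source value is missing so the form field stays editable
    if (value !== undefined) form[mapping.formField] = value;
  }
  return form;
}
```

Because the rules are plain data, a new service can reuse the same code with a different list of mappings.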

document-extractor.ts TypeScript / XML DOM / XPath
import { DOMParser } from 'xmldom';
import xpath from 'xpath';

// Sample mapping configuration for Aadhaar and Marksheet extraction
const EXTRACTION_CONFIG = {
  AADHAAR: [
    { field: 'fullName', xpath: '//CertificateData/PersonalID/@Name' },
    { field: 'uid', xpath: '//CertificateData/PersonalID/@Number' },
    { field: 'dob', xpath: '//CertificateData/PersonalID/@DOB' }
  ],
  SSC_MARKSHEET: [
    { field: 'rollNo', xpath: '//Certificate/Data/Performance/@RollNumber' },
    { field: 'marks', xpath: '//Certificate/Data/Performance/Subject[@Code="085"]/Marks/@Theory' }
  ]
};

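export function extractFields(xmlString: string, doctype: string): Record<string, string> {
  // Pass the MIME type explicitly rather than relying on a parser default
  const doc = new DOMParser().parseFromString(xmlString, 'text/xml');
  const fields = (EXTRACTION_CONFIG as Record<string, { field: string; xpath: string }[]>)[doctype];
  const result: Record<string, string> = {};

  if (!fields) throw new Error(`No configuration found for doctype ${doctype}`);

  for (const config of fields) {
    // xpath.select returns an array of nodes for node-set expressions
    const nodes = xpath.select(config.xpath, doc);
    if (Array.isArray(nodes) && nodes.length > 0) {
      // Attribute nodes expose nodeValue; element nodes fall back to textContent
      const first = nodes[0];
      result[config.field] = first.nodeValue || first.textContent || '';
    }
  }

  return result;
}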

Downloading DigiLocker Files

Some use cases require the original document file, usually as a PDF (e.g., storing a certificate copy against an application for review by department officials).

The technical implementation includes a file download API where the system receives the document URI, calls the DigiLocker file endpoint, processes the base64 response, decodes it, and returns binary file bytes.

Important considerations in production:

  • Validate user session before file access.
  • Ensure the document belongs to the authenticated user.
  • Avoid exposing raw DigiLocker tokens to the frontend.
  • Store files only when required by compliance policy.
  • Track document source and verification metadata.
  • Handle base64 padding and decoding safely.
  • Apply access control for downloaded documents.

download-file.ts TypeScript / Node.js
export async function downloadDocumentFile(req, res) {
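  const { documentUri } = req.query; // e.g., 'in.gov.uidai-ADHAR-12345678'

  if (typeof documentUri !== 'string' || documentUri.length === 0) {
    return res.status(400).json({ error: 'documentUri is required' });
  }

  try {
    const accessToken = await getValidDigiLockerToken(req.userId);

    // 1. Fetch file bytes from DigiLocker File Endpoint.
    //    Encode the URI so a crafted value cannot alter the request path.
    const response = await axios.get(`https://digilocker.gov.in/api/v1/documents/file/${encodeURIComponent(documentUri)}`, {
      headers: { Authorization: `Bearer ${accessToken}` },
      responseType: 'arraybuffer' // Receive raw bytes; a base64 text body must be decoded before it is sent on
    });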

    // 2. Set headers and stream binary PDF directly to the client browser
    res.setHeader('Content-Type', 'application/pdf');
    res.setHeader('Content-Disposition', `attachment; filename="verified-document-${Date.now()}.pdf"`);
    
    return res.send(Buffer.from(response.data));
  } catch (err) {
    console.error('DigiLocker file download failed:', err);
    return res.status(500).json({ error: 'Failed to download document file' });
  }
}
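
The point about handling base64 padding and decoding safely can be illustrated with a small decoder. This is a sketch that assumes the file content arrives as a base64 string, as described above; decodeBase64Payload and looksLikePdf are hypothetical helpers. Node's Buffer.from silently skips invalid characters, so the input is validated explicitly first.

```typescript
// Hypothetical helper: strictly decodes a base64 (or base64url) payload, padded or not
export function decodeBase64Payload(payload: string): Buffer {
  // Drop line breaks and map the URL-safe alphabet onto the standard one
  const normalized = payload.replace(/\s+/g, '').replace(/-/g, '+').replace(/_/g, '/');
  const unpadded = normalized.replace(/=+$/, '');

  // Reject empty input, foreign characters, and lengths no valid encoding can have
  if (unpadded.length === 0 || !/^[A-Za-z0-9+/]+$/.test(unpadded) || unpadded.length % 4 === 1) {
    throw new Error('Invalid base64 payload');
  }

  // Restore padding to a multiple of four before decoding
  const padded = unpadded + '='.repeat((4 - (unpadded.length % 4)) % 4);
  return Buffer.from(padded, 'base64');
}

// Hypothetical helper: checks the PDF magic bytes before serving a file as application/pdf
export function looksLikePdf(bytes: Buffer): boolean {
  return bytes.subarray(0, 5).toString('latin1') === '%PDF-';
}
```

Checking the magic bytes before setting the Content-Type header avoids serving an error page or XML body to the user as a broken PDF.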

Golden Record Auto-Sync

One of the strongest parts of a DigiLocker integration is verified profile synchronization. When a citizen logs in through Meri Pehchaan, the application can fetch verified profile data and sync it into a central citizen record. Each field is tagged with Meri Pehchaan as the source and marked as verified.

| DigiLocker Field | Application Field |
| --- | --- |
| name | Citizen full name |
| dob | Date of birth |
| gender | Gender |
| mobile | Mobile number |
| email | Email |
| digilockerid | DigiLocker ID |
| picture | Profile picture |

This is very useful for citizen service platforms. Instead of asking users to repeatedly enter the same personal details, the system can auto-fill verified information and reduce manual errors. For government workflows, this also helps build a consistent citizen profile across multiple services. Verified fields should also be protected from accidental overwrites, so that a later manual edit cannot silently replace trusted data.
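
A minimal sketch of this sync, assuming the record is stored as a map of tagged fields: the VerifiedField shape, the source labels, and both function names are hypothetical and only illustrate the tagging and overwrite-protection ideas.

```typescript
interface VerifiedField {
  value: string;
  source: string;
  verified: boolean;
  updatedAt: string; // ISO 8601 timestamp
}

type GoldenRecord = Record<string, VerifiedField>;

// Hypothetical helper: merges a freshly fetched profile into the record,
// tagging every synced field with its source and verification status.
export function syncVerifiedProfile(
  record: GoldenRecord,
  profile: Record<string, string | undefined>,
  now: Date
): GoldenRecord {
  const next: GoldenRecord = { ...record };
  for (const [field, value] of Object.entries(profile)) {
    if (!value) continue; // an absent profile value never blanks out existing data
    next[field] = { value, source: 'MERI_PEHCHAAN', verified: true, updatedAt: now.toISOString() };
  }
  return next;
}

// Hypothetical helper: applies a manual edit unless the field is already verified
export function applyManualEdit(record: GoldenRecord, field: string, value: string, now: Date): GoldenRecord {
  if (record[field]?.verified) return record; // verified data is protected from overwrite
  return {
    ...record,
    [field]: { value, source: 'USER_INPUT', verified: false, updatedAt: now.toISOString() },
  };
}
```

Both functions return a new object instead of mutating the input, which makes it easier to log the before and after state for audit purposes.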

Common Production Challenges in DigiLocker Integration

Based on real implementation experience, these are the most common challenges teams face:

  1. Incorrect OAuth Scope: Many teams start with only openid scope and later realize that they cannot access issued documents. The solution is to clearly separate standard login and document access login.
  2. PKCE Misconfiguration: If the code verifier and code challenge are not generated correctly, token exchange fails. Use SHA-256 and base64url encoding correctly, and make sure the verifier used in callback is the same verifier generated during redirect.
  3. Token Expiry Issues: Access tokens expire. If the application does not manage refresh properly, users will face random failures while fetching documents. Store expiry metadata and refresh the token before making DigiLocker calls.
  4. Concurrent Refresh Race Conditions: If multiple API calls happen at the same time and all try to refresh the token, the session may become inconsistent. Use a per-user lock or similar concurrency control mechanism.
  5. Hardcoded Document Mapping: Hardcoding document type and XML fields may work for one service but fails when multiple departments or forms need different document fields. Use a configuration-driven mapping layer.
  6. Poor Re-Authentication UX: When token refresh fails, the user should not see a technical error. The frontend should receive a clear response and reopen the SSO flow.
  7. XML/PDF Differences: Some documents may provide structured XML, while others may require PDF handling. Your extraction layer should support both modes.
  8. Missing Verified Source Metadata: If synced data is not tagged with source and verification status, downstream systems cannot trust it properly. Always store source, timestamp, and verification status.

Production Readiness Checklist

Before going live with your DigiLocker and Meri Pehchaan integration, validate the following security, compliance, and user experience requirements:

OAuth and SSO

  • PKCE is enabled with S256
  • Redirect URI is correctly registered
  • State parameter is validated
  • Client secret is securely stored
  • Callback errors are handled properly

Token Management

  • Access token expiry is tracked
  • Refresh token flow is implemented
  • Failed refresh triggers re-auth
  • Concurrent refresh is controlled
  • Tokens stored securely in backend session metadata

Document Access

  • Separate auth and document scopes
  • Required doctypes are configurable
  • Fetch document list before file access
  • XML and PDF fetch modes supported
  • Document URI handling is secure

Field Extraction

  • XML node and attribute mapping is configurable
  • Profile fields are handled separately
  • Education and marriage fallback chains are supported
  • Missing document scenarios are handled gracefully

Golden Record Sync

  • Verified profile fields mapped correctly
  • Source metadata is stored
  • Synced fields marked as verified
  • Existing citizen records updated safely
  • Duplicate citizen records are avoided

Security and Compliance

  • DigiLocker tokens not exposed to client
  • Sensitive documents access-controlled
  • Comprehensive audit logs maintained
  • Consent-based access respected
  • File storage follows retention policy

User Experience

  • Re-authentication flow is smooth
  • Errors are user-friendly
  • Auto-filled fields are clearly shown
  • Verified fields protected from overwrite
  • Mobile and web journeys are tested

How VAF.ai Helps

DigiLocker integration is not just an API task. It is a complete identity, consent, document, and verified data workflow. VAF.ai helps enterprises and government-facing platforms implement this lifecycle faster with reusable integration patterns and configurable workflows.

  • Secure Meri Pehchaan SSO integration
  • OAuth 2.0 & PKCE auth flows
  • DigiLocker document access workflows
  • Automatic token lifecycle handling
  • Configurable XML/PDF field extraction
  • Smart document fallback chains
  • Golden Record integration
  • Workflow-based document mapping

Instead of building every DigiLocker use case from scratch, teams can use VAF.ai to create repeatable integration flows that work across departments, services, and business processes.

Example Use Case: Citizen Service Application

Consider a citizen applying for a certificate through a digital government platform.

Without DigiLocker Integration

  • Manually enter personal details
  • Upload identity proof manually
  • Upload education or supporting certificates
  • Wait for manual verification by backend teams
  • Correct errors if uploaded documents are unclear or invalid

With DigiLocker & VAF.ai

  • The citizen logs in securely using Meri Pehchaan
  • Verified profile data is fetched automatically
  • The application requests consent for required documents
  • Available DigiLocker documents are listed and selected
  • Required XML/PDF fields are extracted automatically
  • Verified data is synced into the citizen profile
  • The application form is auto-filled instantly
  • Department users can review trusted document data in real-time

This lifecycle-based approach reduces manual effort, improves accuracy, and speeds up service delivery.

Why a Lifecycle-Based Approach Matters

Many integrations fail because they are treated as one API call. But DigiLocker integration has multiple lifecycle stages that must be handled properly:

| Stage | What Needs to Be Handled |
| --- | --- |
| Authentication | SSO, OAuth, PKCE, callback handling |
| Consent | Scope and document permissions management |
| Token | Access token, refresh token, expiry metadata |
| Discovery | List issued documents available in citizen's locker |
| Fetch | XML/PDF/file download and safe base64 decoding |
| Extraction | Field-level mapping based on config templates |
| Sync | Golden Record updates with verified status tagging |
| Error Handling | Managing 401, 402, 403, and re-authentication redirects |
| Audit | Tracking source, verification status, and timestamps |
| Reuse | Configuring mappings across multiple department services |

A lifecycle-based architecture makes the integration maintainable and scalable. That is the core value of VAF.ai.

Conclusion

DigiLocker integration can transform how platforms handle identity verification, document collection, and citizen data onboarding. But production-ready implementation requires more than basic API connectivity.

Whether you are building a citizen service platform, enterprise onboarding system, education verification flow, financial KYC workflow, or government digital service, VAF.ai can help you implement DigiLocker integration with a production-first approach.

Planning to integrate DigiLocker into your application?

VAF.ai can help you design and implement a secure, scalable, and production-ready DigiLocker integration with Meri Pehchaan SSO, document access, verified profile sync, and configurable field extraction.