Digital identity and verified document access are becoming core building blocks for modern citizen services, financial onboarding, education platforms, healthcare systems, and enterprise compliance workflows.
In India, DigiLocker and Meri Pehchaan play a major role in this transformation. DigiLocker enables citizens to access and share verified digital documents, while Meri Pehchaan provides a single sign-on (SSO) layer for secure authentication and consent-based access.
Deceptive Simplicity: For many teams, DigiLocker integration looks simple at the beginning: redirect the user, get authorization, fetch documents, and store the result. But in real production environments, the complexity starts much deeper.
You need to handle OAuth flows, PKCE security, token expiry, document scopes, XML/PDF extraction, citizen profile mapping, re-authentication scenarios, and verified data synchronization. Without the right architecture, teams often end up with broken user journeys, token failures, incomplete document mapping, and hardcoded integration logic that becomes difficult to maintain.
At VAF.ai, we approach DigiLocker integration as a complete lifecycle: Authenticate the citizen → collect consent → manage tokens → fetch profile/documents → extract verified fields → sync with business records → handle errors and re-authentication.
Why DigiLocker Integration Matters
DigiLocker is not just a document storage system. It is a trusted digital document exchange layer that helps platforms reduce manual uploads, improve verification accuracy, and build faster onboarding journeys.
For government and enterprise applications, DigiLocker integration can support use cases such as:
- Citizen identity verification
- Education certificate verification
- Caste, income, residence, and other certificate access
- Applicant profile auto-fill
- KYC and onboarding
- Document-based eligibility checks
- Verified profile creation
- Digital service delivery workflows
Instead of asking users to manually upload scanned copies, an application can request consent and fetch verified documents directly from DigiLocker. This creates a better user experience and improves data reliability.
However, to make this work in production, the integration must be secure, configurable, and resilient.
What is Meri Pehchaan?
Meri Pehchaan is the Government of India’s single sign-on platform built on DigiLocker. It supports Aadhaar-linked identity verification and document access through OAuth 2.0, OpenID Connect (OIDC), and PKCE-based authorization flows.
In a typical implementation, Meri Pehchaan acts as the authentication layer, while DigiLocker APIs provide access to profile and document data after user consent.
A secure implementation usually includes:
- OAuth 2.0 authorization code flow
- PKCE (Proof Key for Code Exchange) with S256 challenge method
- OpenID Connect identity flow
- Consent-based document access
- Access token and refresh token handling
- Profile and document API calls
- Application-level session generation
A simple login integration may only need an openid scope. But if your application needs to access issued documents, additional scopes such as document access permissions are required. The integration must clearly separate authentication-only flows from document-access flows.
High-Level DigiLocker Integration Architecture
A production-grade architecture usually contains the following components to prevent tight coupling and ensure secure flows:
- Client Application: The web or mobile application where the citizen starts the login or document access flow.
- IAM / Authentication Service: The backend service that generates the Meri Pehchaan redirect URL, manages PKCE, handles callbacks, exchanges authorization codes, and issues local application tokens.
- Meri Pehchaan Authorization Server: The SSO provider that authenticates the user and returns the authorization code.
- DigiLocker APIs: APIs used to fetch the user profile, issued documents, XML data, and PDF files.
- Application Session Store: Stores local JWT sessions, refresh tokens, and metadata required for further DigiLocker access.
- Golden Record / Business Record Store: Stores verified citizen profile data and document-derived fields.
- Document Mapping Layer: Maps DigiLocker fields, XML nodes, and document attributes into application-specific fields.
This separation is important because DigiLocker integration should not be tightly coupled with one form, one service, or one hardcoded document type.
A scalable implementation should allow different services to configure which document type to fetch, which XML/PDF field to extract, and where that value should be mapped inside the application.
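As a rough illustration of that idea, per-service requirements could live in plain configuration rather than in form code. This is a minimal sketch only: the type names, service IDs, target field paths, and the `requirementsFor` helper are all hypothetical, and the doctype codes are placeholders rather than official DigiLocker codes.

```typescript
// Hypothetical shape for per-service document requirements.
// Service IDs, doctype codes, and field paths are illustrative assumptions.
type FetchType = 'xml' | 'pdf';

interface DocumentRequirement {
  doctype: string;        // document type code to look for
  fetchType: FetchType;   // whether to fetch XML or PDF
  xmlNode?: string;       // node path used when fetchType is 'xml'
  xmlAttribute?: string;  // attribute to read from that node
  targetField: string;    // application field that receives the value
}

const SERVICE_REQUIREMENTS: Record<string, DocumentRequirement[]> = {
  'scholarship-application': [
    {
      doctype: 'SSC_MARKSHEET',
      fetchType: 'xml',
      xmlNode: '//Certificate/Data/Performance',
      xmlAttribute: 'RollNumber',
      targetField: 'applicant.rollNo',
    },
  ],
  'residence-certificate': [
    { doctype: 'AADHAAR', fetchType: 'pdf', targetField: 'attachments.identityProof' },
  ],
};

// Look up requirements by service instead of hardcoding them per form.
export function requirementsFor(serviceId: string): DocumentRequirement[] {
  return SERVICE_REQUIREMENTS[serviceId] ?? [];
}
```

With this shape, adding a new department service means adding a configuration entry, not a new code path.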
Step 1: Generate Meri Pehchaan SSO Redirect URL
The first step is to generate a secure SSO redirect URL.
In a production implementation, the backend should generate a PKCE code verifier and a corresponding code challenge. The verifier and state are stored temporarily (for example, in a secure, HTTP-only session cookie or a server-side session), and the user is then redirected to the Meri Pehchaan authorization URL.
The redirect request typically includes the following parameters:
| Parameter | Purpose |
|---|---|
| `response_type` | Usually `code` |
| `client_id` | OAuth client ID provided by DigiLocker / Meri Pehchaan |
| `redirect_uri` | Registered callback URL where auth code is returned |
| `state` | Client-generated state parameter for request validation (CSRF protection) |
| `scope` | Requested scope, such as `openid` or `files.issueddocs` |
| `code_challenge` | SHA-256 hashed PKCE challenge |
| `code_challenge_method` | Usually `S256` |
In the technical implementation, the SSO redirect API generates a Meri Pehchaan authorization URL using PKCE, where the code_verifier is created first, hashed using SHA-256, and sent as the code_challenge. This approach prevents authorization code interception attacks and is especially important for modern web and mobile applications.
```typescript
import crypto from 'crypto';

interface RedirectParams {
  clientId: string;
  redirectUri: string;
  scope: string;
}

export function generateSSORedirectURL({ clientId, redirectUri, scope }: RedirectParams) {
  // 1. Generate secure random 43-character PKCE code_verifier
  const codeVerifier = crypto
    .randomBytes(32)
    .toString('base64url');

  // 2. Hash code_verifier using SHA-256 to create code_challenge
  const codeChallenge = crypto
    .createHash('sha256')
    .update(codeVerifier)
    .digest()
    .toString('base64url');

  // 3. Generate random state for CSRF validation
  const state = crypto.randomBytes(16).toString('hex');

  // 4. Build authorization URL
  const authUrl = new URL('https://meripehchaan.gov.in/oauth2/authorize');
  authUrl.searchParams.append('response_type', 'code');
  authUrl.searchParams.append('client_id', clientId);
  authUrl.searchParams.append('redirect_uri', redirectUri);
  authUrl.searchParams.append('state', state);
  authUrl.searchParams.append('scope', scope);
  authUrl.searchParams.append('code_challenge', codeChallenge);
  authUrl.searchParams.append('code_challenge_method', 'S256');

  return {
    url: authUrl.toString(),
    state,
    codeVerifier
  };
}
```
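Because a wrong hash or encoding only shows up later as a failed token exchange, it can help to unit-test the verifier and challenge logic on its own. The sketch below is one way to do that; the helper names `deriveCodeChallenge` and `isValidCodeVerifier` are illustrative and not part of any DigiLocker SDK.

```typescript
import { createHash } from 'node:crypto';

// Derive the S256 code_challenge for a given code_verifier (RFC 7636, section 4.2):
// base64url(SHA-256(ASCII(code_verifier))), without padding.
export function deriveCodeChallenge(codeVerifier: string): string {
  return createHash('sha256').update(codeVerifier, 'ascii').digest('base64url');
}

// RFC 7636, section 4.1: 43 to 128 characters from the unreserved set.
export function isValidCodeVerifier(codeVerifier: string): boolean {
  return /^[A-Za-z0-9\-._~]{43,128}$/.test(codeVerifier);
}
```

RFC 7636 (Appendix B) publishes a reference pair that is useful as a regression test: the verifier `dBjftJeZ4CVP-mB92K27uhbUJU1p1r_wW1gFWFOEjXk` should produce the challenge `E9Melhoa2OwvFrEMTJguCHaoeGQzmkkxM_cGI4gPCSM`.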
Step 2: Handle OAuth Callback and Token Exchange
After the citizen successfully authenticates through Meri Pehchaan, the browser is redirected back to your application callback URI with an authorization code.
The backend then performs the token exchange:
- Receive authorization code and state from callback request.
- Validate the state parameter against the stored state to prevent CSRF attacks.
- Send the authorization code and original PKCE verifier to the Meri Pehchaan token endpoint.
- Receive the access token and refresh token.
- Use the access token to fetch the DigiLocker user profile.
- Create or update the citizen record in your database.
- Issue local application JWT access and refresh tokens.
Design Best Practice: Your application should not expose raw DigiLocker tokens directly to the frontend for long-term use. Instead, the backend should manage secure token storage in session metadata and issue application-level tokens for internal access control.
```typescript
import axios from 'axios';
import type { Request, Response } from 'express';

// `db` is the application's own data-access layer and is assumed to be in scope.
// `req.cookies` assumes cookie-parsing middleware (for example, cookie-parser).
export async function handleCallback(req: Request, res: Response) {
  const { code, state } = req.query;

  // 1. Retrieve stored state and code verifier from secure session cookie
  const storedState = req.cookies.sso_state;
  const codeVerifier = req.cookies.sso_verifier;

  // Reject missing values as well: two undefined states would otherwise compare as equal
  if (typeof code !== 'string' || typeof state !== 'string' || !codeVerifier || state !== storedState) {
    return res.status(400).json({ error: 'State validation failed (CSRF mismatch)' });
  }

  // State and verifier are single-use, so clear them before continuing
  res.clearCookie('sso_state');
  res.clearCookie('sso_verifier');

  try {
    // 2. Exchange authorization code & PKCE verifier for tokens.
    //    OAuth 2.0 token endpoints expect a form-encoded body, which axios
    //    sends automatically when given URLSearchParams.
    const tokenResponse = await axios.post(
      'https://meripehchaan.gov.in/oauth2/token',
      new URLSearchParams({
        grant_type: 'authorization_code',
        code,
        client_id: process.env.DIGILOCKER_CLIENT_ID ?? '',
        client_secret: process.env.DIGILOCKER_CLIENT_SECRET ?? '',
        redirect_uri: process.env.DIGILOCKER_REDIRECT_URI ?? '',
        code_verifier: codeVerifier
      })
    );
    const { access_token, refresh_token, expires_in } = tokenResponse.data;

    // 3. Fetch citizen profile from DigiLocker API
    const profileResponse = await axios.get('https://digilocker.gov.in/api/v1/profile', {
      headers: { Authorization: `Bearer ${access_token}` }
    });
    const citizen = profileResponse.data; // Contains name, dob, gender, digilockerid

    // 4. Upsert citizen in database (Golden Record) & issue local JWT
    const userSession = await db.citizens.upsertAndLogin({
      profile: citizen,
      tokens: { access_token, refresh_token, expires_in }
    });

    res.cookie('app_session', userSession.jwt, { httpOnly: true, secure: true });
    return res.redirect('/dashboard');
  } catch (error) {
    console.error('SSO exchange failed:', error);
    return res.redirect('/login?error=sso_failed');
  }
}
```
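The design best practice above, issuing application-level tokens instead of exposing DigiLocker tokens, can be sketched with Node's built-in `crypto` module. This is a simplified HS256 JWT for illustration only; `issueAppToken` and `verifyAppToken` are hypothetical names, and a production system would more likely rely on a vetted JWT library and key rotation.

```typescript
import { createHmac, timingSafeEqual } from 'node:crypto';

interface AppClaims {
  sub: string; // local citizen ID; no DigiLocker token is embedded
  iat: number; // issued-at, seconds since epoch
  exp: number; // expiry, seconds since epoch
}

const encode = (value: object): string =>
  Buffer.from(JSON.stringify(value)).toString('base64url');

const sign = (data: string, secret: string): string =>
  createHmac('sha256', secret).update(data).digest('base64url');

// Issue a compact HS256 JWT that carries only local identifiers.
export function issueAppToken(citizenId: string, secret: string, ttlSeconds: number, nowSeconds: number): string {
  const header = encode({ alg: 'HS256', typ: 'JWT' });
  const payload = encode({ sub: citizenId, iat: nowSeconds, exp: nowSeconds + ttlSeconds });
  return `${header}.${payload}.${sign(`${header}.${payload}`, secret)}`;
}

// Return the claims when the signature matches and the token is unexpired, otherwise null.
export function verifyAppToken(token: string, secret: string, nowSeconds: number): AppClaims | null {
  const parts = token.split('.');
  if (parts.length !== 3) return null;
  const [header, payload, signature] = parts;

  const expected = Buffer.from(sign(`${header}.${payload}`, secret));
  const actual = Buffer.from(signature);
  // timingSafeEqual requires equal lengths, so check that first
  if (expected.length !== actual.length || !timingSafeEqual(expected, actual)) return null;

  try {
    const claims = JSON.parse(Buffer.from(payload, 'base64url').toString('utf8')) as AppClaims;
    return claims.exp > nowSeconds ? claims : null;
  } catch {
    return null;
  }
}
```

The DigiLocker access and refresh tokens stay in the backend session store, keyed by the same citizen ID that appears in `sub`.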
Authentication-Only vs Document Access Flow
A common mistake in DigiLocker integration is treating all login flows the same. In reality, you should separate standard login and document access login:
| Flow Type | Scope | Purpose |
|---|---|---|
| Standard login | `openid` | Authenticate citizen and fetch profile. Used for logging into citizen dashboards. |
| Document access login | `openid files.issueddocs` | Authenticate citizen and access issued documents. Used for applying for services/certificates. |
The technical implementation separates these flows using different endpoints: one for normal SSO redirect and another for document access redirect. The document access flow can also include an optional req_doctype parameter to preselect a specific document type.
By separating these flows, the application can reduce unnecessary consent requests and improve user experience.
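One way to keep the two flows apart is to derive the scope and optional parameters from a single flow selector. The sketch below assumes the scopes from the table above and the `req_doctype` parameter mentioned in the text; `LoginFlow` and `buildFlowParams` are hypothetical names.

```typescript
type LoginFlow = 'standard' | 'document-access';

// Scopes follow the flow table; they are the only place scopes are defined.
const FLOW_SCOPES: Record<LoginFlow, string> = {
  'standard': 'openid',
  'document-access': 'openid files.issueddocs',
};

// Build the flow-specific query parameters to merge into the authorization URL.
export function buildFlowParams(flow: LoginFlow, reqDoctype?: string): URLSearchParams {
  const params = new URLSearchParams({ scope: FLOW_SCOPES[flow] });
  // req_doctype only makes sense when documents are being requested
  if (flow === 'document-access' && reqDoctype) {
    params.set('req_doctype', reqDoctype);
  }
  return params;
}
```

A standard login then never requests document consent, even if a caller passes a document type by mistake.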
Managing DigiLocker Token Lifecycle
Token lifecycle management is one of the most important parts of a production DigiLocker integration. A basic implementation may work during initial testing, but production users will face scenarios like:
- Access token expired
- Refresh token missing or failed
- Multiple API calls trying to refresh the token at the same time (race condition)
- DigiLocker rejecting the stored token
The technical implementation handles token lifecycle by storing DigiLocker tokens in session metadata and managing refresh logic automatically. It also uses a per-user async lock to avoid race conditions during concurrent refresh attempts.
```typescript
import AsyncLock from 'async-lock'; // the package's default export is the lock class
import axios from 'axios';

// Serializes refresh attempts per user within this Node.js process.
// Deployments with several instances also need a shared lock (for example, in the session store).
// `db` is the application's own data-access layer and is assumed to be in scope.
const lock = new AsyncLock();

export async function getValidDigiLockerToken(userId: string): Promise<string> {
  // Lock token refresh execution per unique user ID
  return lock.acquire(userId, async () => {
    const session = await db.sessions.findByUserId(userId);
    if (!session) {
      throw new Error('401: DIGILOCKER_NOT_LINKED');
    }

    const now = Math.floor(Date.now() / 1000);

    // If token is valid for more than 5 minutes, return it
    if (session.access_token && session.token_expires_at - now > 300) {
      return session.access_token;
    }

    // Without a refresh token, the only option is a fresh SSO round trip
    if (!session.refresh_token) {
      throw new Error('402: RE_AUTH_REQUIRED');
    }

    // Token is expired or expiring soon: execute OAuth 2.0 refresh
    try {
      const refreshResponse = await axios.post(
        'https://meripehchaan.gov.in/oauth2/token',
        // Token endpoints expect a form-encoded body rather than JSON
        new URLSearchParams({
          grant_type: 'refresh_token',
          refresh_token: session.refresh_token,
          client_id: process.env.DIGILOCKER_CLIENT_ID ?? '',
          client_secret: process.env.DIGILOCKER_CLIENT_SECRET ?? ''
        })
      );
      const { access_token, refresh_token: newRefreshToken, expires_in } = refreshResponse.data;

      await db.sessions.updateTokens(userId, {
        access_token,
        refresh_token: newRefreshToken || session.refresh_token,
        token_expires_at: now + expires_in
      });
      return access_token;
    } catch (err) {
      console.error(`Token refresh failed for user ${userId}:`, err);
      throw new Error('403: OAUTH_REJECTED');
    }
  });
}
```
A practical token handling strategy should include:
| Scenario | Action |
|---|---|
| Token is valid | Use existing token |
| Token expires soon | Refresh before making API call |
| Refresh token exists | Exchange refresh token for a new access token |
| Refresh fails or no refresh token | Ask user to re-authenticate (return re-auth response) |
| Token rejected by DigiLocker | Redirect user to SSO again |
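In a production system, error codes should also be meaningful so that the frontend can react to them cleanly. The codes below are application-level conventions that reuse HTTP status numbers; since 402 normally means Payment Required in HTTP, document this convention for anyone consuming the API: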
| Code | Meaning | Recommended Action |
|---|---|---|
| 401 | DigiLocker not linked | Redirect user to SSO |
| 402 | Session expired or no refresh token | Redirect user to SSO |
| 403 | Token rejected by DigiLocker | Redirect user to SSO |
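Instead of encoding these codes in error message strings, a typed error can carry the code directly. The following is a minimal sketch; `ReauthRequiredError`, the reason strings, and the `REDIRECT_TO_SSO` action value are hypothetical names chosen for illustration.

```typescript
type ReauthCode = 401 | 402 | 403;

const REAUTH_REASONS: Record<ReauthCode, string> = {
  401: 'DIGILOCKER_NOT_LINKED',
  402: 'SESSION_EXPIRED',
  403: 'TOKEN_REJECTED',
};

export class ReauthRequiredError extends Error {
  readonly code: ReauthCode;

  constructor(code: ReauthCode) {
    super(REAUTH_REASONS[code]);
    this.name = 'ReauthRequiredError';
    this.code = code;
    // Keep instanceof working even when compiled to older targets
    Object.setPrototypeOf(this, ReauthRequiredError.prototype);
  }
}

interface ClientResponse {
  status: number;
  body: { reason: string; action: 'REDIRECT_TO_SSO' | 'RETRY_LATER' };
}

// Convert any thrown value into a response the frontend can act on without parsing strings.
export function toClientResponse(error: unknown): ClientResponse {
  if (error instanceof ReauthRequiredError) {
    return { status: error.code, body: { reason: error.message, action: 'REDIRECT_TO_SSO' } };
  }
  return { status: 500, body: { reason: 'INTERNAL_ERROR', action: 'RETRY_LATER' } };
}
```

The frontend then only needs to check `action` to decide whether to reopen the SSO flow.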
Fetching Issued Documents from DigiLocker
Once document access is granted, the application can fetch the list of issued documents from DigiLocker.
The issued documents response typically includes: Document name, Document type code, Description, Issuer, URI, MIME type, and Issue date.
The technical document includes an issued documents flow where /digilocker/documents lists the documents available in the citizen’s DigiLocker account, including document metadata such as name, doctype, issuer, URI, MIME type, and date. This document list is important because field extraction should not blindly assume that a particular certificate is available.
A reliable implementation should follow this flow:
- Fetch all available documents.
- Match against configured document type (e.g. 10th marksheet, driving license).
- Select the correct document URI.
- Fetch XML or PDF based on configuration.
- Extract required value and return structured response.
This approach makes the integration flexible across departments, workflows, and document types.
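```typescript
import axios from 'axios';
import type { Request, Response } from 'express';

// Assumes getValidDigiLockerToken (shown earlier) is in scope and that
// `userId` is set on the request by your authentication middleware.
export async function getIssuedDocuments(req: Request & { userId: string }, res: Response) {
  try {
    const accessToken = await getValidDigiLockerToken(req.userId);

    // Fetch the full index of all certificates and documents issued to this citizen
    const response = await axios.get('https://digilocker.gov.in/api/v1/documents/issued', {
      headers: { Authorization: `Bearer ${accessToken}` }
    });

    // Return list containing: name, doctype, issuer, uri, mime, date
    return res.json({
      success: true,
      documents: response.data.items
    });
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    // Pass through the re-auth codes raised by the token helper;
    // any other failure (network, upstream outage) is reported as a server error
    const match = /^(401|402|403):/.exec(message);
    if (match) {
      return res.status(Number(match[1])).json({ error: message });
    }
    console.error('Fetching issued documents failed:', err);
    return res.status(500).json({ error: 'Failed to fetch issued documents' });
  }
}
```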
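Since later steps depend on fields such as `doctype` and `uri`, it can be worth validating the issued-documents payload before using it. The sketch below assumes the response carries an `items` array with the metadata fields listed above as strings; the exact response shape should be confirmed against the API you are integrating with, and `parseIssuedDocuments` is a hypothetical helper.

```typescript
interface IssuedDocument {
  name: string;
  doctype: string;
  issuer: string;
  uri: string;
  mime: string;
  date: string;
}

const REQUIRED_KEYS = ['name', 'doctype', 'issuer', 'uri', 'mime', 'date'] as const;

// Type guard: every required key must be present as a string.
function isIssuedDocument(value: unknown): value is IssuedDocument {
  if (typeof value !== 'object' || value === null) return false;
  const record = value as Record<string, unknown>;
  return REQUIRED_KEYS.every((key) => typeof record[key] === 'string');
}

// Keep only well-formed entries; an unexpected payload yields an empty list.
export function parseIssuedDocuments(payload: unknown): IssuedDocument[] {
  if (typeof payload !== 'object' || payload === null) return [];
  const items = (payload as { items?: unknown }).items;
  return Array.isArray(items) ? items.filter(isIssuedDocument) : [];
}
```

Downstream matching and extraction code can then rely on every entry having a usable `doctype` and `uri`.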
Field-Level Extraction from DigiLocker Documents
For many enterprise and government use cases, simply downloading a document is not enough. The application often needs to extract specific values, such as: Name, Date of birth, Certificate number, Father's name, Issuer, Education qualification, Marriage registration details, and Address fields.
This is where a configuration-driven extraction layer becomes valuable:
| Configuration Field | Purpose |
|---|---|
| `doctype` | Document type code, such as education, Aadhaar, profile, or marriage |
| `xml_node` | XPath or node path to locate the data in the XML file |
| `xml_attribute` | Attribute name to extract from the XML node |
| `fetch_type` | Whether the system should fetch XML or PDF |
The implementation also supports smart document chains for education and marriage documents. For example, the education chain can try multiple document types in order until a matching document is found.
Instead of hardcoding every document flow, the platform can allow administrators or workflow designers to configure which document type is required, which field should be extracted, which form field should receive the value, whether the source is profile, XML, or PDF, and whether fallback document chains should be used. At VAF.ai, this is the difference between a one-time API integration and a reusable integration framework.
```typescript
import { DOMParser } from '@xmldom/xmldom'; // maintained successor of the deprecated 'xmldom' package
import xpath from 'xpath';

interface FieldRule {
  field: string;
  xpath: string;
}

// Sample mapping configuration for Aadhaar and Marksheet extraction.
// The XPath expressions are illustrative; confirm them against the XML each issuer returns.
const EXTRACTION_CONFIG: Record<string, FieldRule[]> = {
  AADHAAR: [
    { field: 'fullName', xpath: '//CertificateData/PersonalID/@Name' },
    { field: 'uid', xpath: '//CertificateData/PersonalID/@Number' },
    { field: 'dob', xpath: '//CertificateData/PersonalID/@DOB' }
  ],
  SSC_MARKSHEET: [
    { field: 'rollNo', xpath: '//Certificate/Data/Performance/@RollNumber' },
    { field: 'marks', xpath: '//Certificate/Data/Performance/Subject[@Code="085"]/Marks/@Theory' }
  ]
};

export function extractFields(xmlString: string, doctype: string): Record<string, string> {
  const fields = EXTRACTION_CONFIG[doctype];
  if (!fields) throw new Error(`No configuration found for doctype ${doctype}`);

  const doc = new DOMParser().parseFromString(xmlString, 'text/xml');
  const result: Record<string, string> = {};

  for (const config of fields) {
    // select() returns an array of nodes for node-set expressions
    const nodes = xpath.select(config.xpath, doc as unknown as Node);
    if (Array.isArray(nodes) && nodes.length > 0) {
      // Attribute nodes expose nodeValue; element nodes expose textContent
      const value = nodes[0].nodeValue || nodes[0].textContent;
      if (value) result[config.field] = value;
    }
  }
  return result;
}
```
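The fallback document chains mentioned earlier can be sketched as an ordered list of document types that is walked until one is found in the citizen's locker. The chain contents and doctype codes below are placeholders rather than official DigiLocker codes, and `resolveChain` is a hypothetical helper.

```typescript
interface LockerDocument {
  doctype: string;
  uri: string;
}

// Hypothetical chains: doctype codes are placeholders, listed in priority order.
const DOCUMENT_CHAINS: Record<string, string[]> = {
  education: ['POSTGRADUATE_DEGREE', 'GRADUATE_DEGREE', 'HSC_MARKSHEET', 'SSC_MARKSHEET'],
  marriage: ['MARRIAGE_CERTIFICATE', 'MARRIAGE_REGISTRATION'],
};

// Return the first available document that matches the chain, or null when none does.
export function resolveChain(chainName: string, available: LockerDocument[]): LockerDocument | null {
  const chain = DOCUMENT_CHAINS[chainName] ?? [];
  for (const doctype of chain) {
    const match = available.find((doc) => doc.doctype === doctype);
    if (match) return match;
  }
  return null;
}
```

Returning `null` instead of throwing lets the caller decide how to handle a missing document, for example by asking the citizen for a manual upload.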
Downloading DigiLocker Files
Some use cases require the original document file, usually as a PDF (e.g., storing a certificate copy against an application for review by department officials).
The technical implementation includes a file download API where the system receives the document URI, calls the DigiLocker file endpoint, processes the base64 response, decodes it, and returns binary file bytes.
Important considerations in production:
- Validate user session before file access.
- Ensure the document belongs to the authenticated user.
- Avoid exposing raw DigiLocker tokens to the frontend.
- Store files only when required by compliance policy.
- Track document source and verification metadata.
- Handle base64 padding and decoding safely.
- Apply access control for downloaded documents.
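```typescript
import axios from 'axios';
import type { Request, Response } from 'express';

// Assumes getValidDigiLockerToken is in scope and `userId` is set by your auth middleware
export async function downloadDocumentFile(req: Request & { userId: string }, res: Response) {
  const { documentUri } = req.query; // e.g., 'in.gov.uidai-ADHAR-12345678'

  // Allow only plain URI tokens so the value cannot alter the request path (pattern is an assumption)
  if (typeof documentUri !== 'string' || !/^[A-Za-z0-9._-]+$/.test(documentUri)) {
    return res.status(400).json({ error: 'Invalid document URI' });
  }
  try {
    const accessToken = await getValidDigiLockerToken(req.userId);
    // 1. Fetch file bytes from DigiLocker File Endpoint
    const response = await axios.get(
      `https://digilocker.gov.in/api/v1/documents/file/${encodeURIComponent(documentUri)}`,
      {
        headers: { Authorization: `Bearer ${accessToken}` },
        responseType: 'arraybuffer' // Receive the body as raw bytes
      }
    );
    // 2. Set headers and send the PDF bytes to the client browser.
    //    If the endpoint returns base64 text instead, decode it before sending.
    res.setHeader('Content-Type', 'application/pdf');
    res.setHeader('Content-Disposition', `attachment; filename="verified-document-${Date.now()}.pdf"`);
    return res.send(Buffer.from(response.data));
  } catch (err) {
    console.error('DigiLocker file download failed:', err);
    // Pass through the re-auth codes raised by the token helper
    const match = /^(401|402|403):/.exec(err instanceof Error ? err.message : '');
    if (match) return res.status(Number(match[1])).json({ error: 'Re-authentication required' });
    return res.status(500).json({ error: 'Failed to download document file' });
  }
}
```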
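For the "handle base64 padding and decoding safely" consideration, Node's `Buffer.from(text, 'base64')` silently ignores invalid characters, so an explicit check is useful when the file arrives as base64 text. The sketch below is one possible approach; `decodeBase64Safely` and `looksLikePdf` are hypothetical helper names.

```typescript
// Decode base64 or base64url text, tolerating whitespace and missing padding.
// Throws on characters outside the base64 alphabet instead of silently dropping them.
export function decodeBase64Safely(input: string): Buffer {
  const compact = input.replace(/\s+/g, '').replace(/-/g, '+').replace(/_/g, '/');
  const unpadded = compact.replace(/=+$/, '');

  // A length of 4n+1 can never be valid base64
  if (!/^[A-Za-z0-9+/]*$/.test(unpadded) || unpadded.length % 4 === 1) {
    throw new Error('Invalid base64 payload');
  }

  const padded = unpadded + '='.repeat((4 - (unpadded.length % 4)) % 4);
  return Buffer.from(padded, 'base64');
}

// A PDF file starts with the ASCII bytes "%PDF-".
export function looksLikePdf(bytes: Buffer): boolean {
  return bytes.subarray(0, 5).toString('latin1') === '%PDF-';
}
```

Checking the leading bytes before storing or streaming the file is a cheap guard against saving an error payload as a certificate.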
Golden Record Auto-Sync
One of the strongest parts of a DigiLocker integration is verified profile synchronization. When a citizen logs in through Meri Pehchaan, the application can fetch verified profile data and sync it into a central citizen record. Each field is tagged with Meri Pehchaan as the source and marked as verified.
| DigiLocker Field | Application Field |
|---|---|
| `name` | Citizen full name |
| `dob` | Date of birth |
| `gender` | Gender |
| `mobile` | Mobile number |
| `email` | Email address |
| `digilockerid` | DigiLocker ID |
| `picture` | Profile picture |
This is very useful for citizen service platforms. Instead of asking users to repeatedly enter the same personal details, the system can auto-fill verified information and reduce manual errors. For government workflows, this also helps build a consistent citizen profile across multiple services. Because each synced field carries its source and verification status, the application can also protect verified fields from accidental overwrites.
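A minimal sketch of this sync, assuming the profile keys from the mapping table, might look as follows. The `VerifiedField` shape, the target field names, the source labels, and both function names are hypothetical.

```typescript
interface VerifiedField {
  value: string;
  source: string;    // for example 'meri_pehchaan' or 'self_declared'
  verified: boolean;
  syncedAt: string;  // ISO 8601 timestamp
}

type GoldenRecord = Record<string, VerifiedField>;

// Profile keys follow the mapping table; target names are assumptions.
const PROFILE_FIELD_MAP: Record<string, string> = {
  name: 'fullName',
  dob: 'dateOfBirth',
  gender: 'gender',
  mobile: 'mobileNumber',
  email: 'email',
  digilockerid: 'digilockerId',
  picture: 'profilePicture',
};

// Copy each available profile value into the record, tagged as verified.
export function syncVerifiedProfile(
  record: GoldenRecord,
  profile: Record<string, string | undefined>,
  now: Date,
): GoldenRecord {
  const next: GoldenRecord = { ...record };
  for (const [sourceKey, targetKey] of Object.entries(PROFILE_FIELD_MAP)) {
    const value = profile[sourceKey];
    if (value) {
      next[targetKey] = { value, source: 'meri_pehchaan', verified: true, syncedAt: now.toISOString() };
    }
  }
  return next;
}

// Self-declared edits never replace a verified value.
export function applyUserEdit(record: GoldenRecord, field: string, value: string, now: Date): GoldenRecord {
  if (record[field]?.verified) return record;
  return { ...record, [field]: { value, source: 'self_declared', verified: false, syncedAt: now.toISOString() } };
}
```

Both functions return a new record rather than mutating the input, which keeps the previous state available for audit logging.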
Common Production Challenges in DigiLocker Integration
Based on real implementation experience, these are the most common challenges teams face:
- Incorrect OAuth Scope: Many teams start with only the `openid` scope and later realize that they cannot access issued documents. The solution is to clearly separate standard login and document access login.
- PKCE Misconfiguration: If the code verifier and code challenge are not generated correctly, token exchange fails. Use SHA-256 and base64url encoding correctly, and make sure the verifier used in callback is the same verifier generated during redirect.
- Token Expiry Issues: Access tokens expire. If the application does not manage refresh properly, users will face random failures while fetching documents. Store expiry metadata and refresh the token before making DigiLocker calls.
- Concurrent Refresh Race Conditions: If multiple API calls happen at the same time and all try to refresh the token, the session may become inconsistent. Use a per-user lock or similar concurrency control mechanism.
- Hardcoded Document Mapping: Hardcoding document type and XML fields may work for one service but fails when multiple departments or forms need different document fields. Use a configuration-driven mapping layer.
- Poor Re-Authentication UX: When token refresh fails, the user should not see a technical error. The frontend should receive a clear response and reopen the SSO flow.
- XML/PDF Differences: Some documents may provide structured XML, while others may require PDF handling. Your extraction layer should support both modes.
- Missing Verified Source Metadata: If synced data is not tagged with source and verification status, downstream systems cannot trust it properly. Always store source, timestamp, and verification status.
Before going live with your DigiLocker and Meri Pehchaan integration, validate the following security, compliance, and user experience requirements:
OAuth and SSO
- PKCE is enabled with S256
- Redirect URI is correctly registered
- State parameter is validated
- Client secret is securely stored
- Callback errors are handled properly
Token Management
- Access token expiry is tracked
- Refresh token flow is implemented
- Failed refresh triggers re-auth
- Concurrent refresh is controlled
- Tokens stored securely in backend session metadata
Document Access
- Separate auth and document scopes
- Required doctypes are configurable
- Fetch document list before file access
- XML and PDF fetch modes supported
- Document URI handling is secure
Field Extraction
- XML node and attribute mapping is configurable
- Profile fields are handled separately
- Education and marriage fallback chains are supported
- Missing document scenarios are handled gracefully
Golden Record Sync
- Verified profile fields mapped correctly
- Source metadata is stored
- Synced fields marked as verified
- Existing citizen records updated safely
- Duplicate citizen records are avoided
Security and Compliance
- DigiLocker tokens not exposed to client
- Sensitive documents access-controlled
- Comprehensive audit logs maintained
- Consent-based access respected
- File storage follows retention policy
User Experience
- Re-authentication flow is smooth
- Errors are user-friendly
- Auto-filled fields are clearly shown
- Verified fields protected from overwrite
- Mobile and web journeys are tested
How VAF.ai Helps
DigiLocker integration is not just an API task. It is a complete identity, consent, document, and verified data workflow. VAF.ai helps enterprises and government-facing platforms implement this lifecycle faster with reusable integration patterns and configurable workflows.
Instead of building every DigiLocker use case from scratch, teams can use VAF.ai to create repeatable integration flows that work across departments, services, and business processes.
Example Use Case: Citizen Service Application
Consider a citizen applying for a certificate through a digital government platform.
Without DigiLocker Integration
- Manually enter personal details
- Upload identity proof manually
- Upload education or supporting certificates
- Wait for manual verification by backend teams
- Correct errors if uploaded documents are unclear or invalid
With DigiLocker & VAF.ai
- The citizen logs in securely using Meri Pehchaan
- Verified profile data is fetched automatically
- The application requests consent for required documents
- Available DigiLocker documents are listed and selected
- Required XML/PDF fields are extracted automatically
- Verified data is synced into the citizen profile
- The application form is auto-filled instantly
- Department users can review trusted document data in real-time
This lifecycle-based approach reduces manual effort, improves accuracy, and speeds up service delivery.
Why a Lifecycle-Based Approach Matters
Many integrations fail because they are treated as one API call. But DigiLocker integration has multiple lifecycle stages that must be handled properly:
| Stage | What Needs to Be Handled |
|---|---|
| Authentication | SSO, OAuth, PKCE, callback handling |
| Consent | Scope and document permissions management |
| Token | Access token, refresh token, expiry metadata |
| Discovery | List issued documents available in citizen's locker |
| Fetch | XML/PDF/file download and safe base64 decoding |
| Extraction | Field-level mapping based on config templates |
| Sync | Golden Record updates with verified status tagging |
| Error Handling | Managing 401, 402, 403, and re-authentication redirects |
| Audit | Tracking source, verification status, and timestamps |
| Reuse | Configuring mappings across multiple department services |
A lifecycle-based architecture makes the integration maintainable and scalable. That is the core value of VAF.ai.
Conclusion
DigiLocker integration can transform how platforms handle identity verification, document collection, and citizen data onboarding. But production-ready implementation requires more than basic API connectivity.
Whether you are building a citizen service platform, enterprise onboarding system, education verification flow, financial KYC workflow, or government digital service, VAF.ai can help you implement DigiLocker integration with a production-first approach.
Planning to integrate DigiLocker into your application?
VAF.ai can help you design and implement a secure, scalable, and production-ready DigiLocker integration with Meri Pehchaan SSO, document access, verified profile sync, and configurable field extraction.