Prisma Access: Operations, Monitoring & Troubleshooting
Effective operation of Prisma Access relies on proactive monitoring, robust logging and reporting capabilities, efficient troubleshooting methodologies, and a clear understanding of the update processes. This document covers these critical operational aspects.
Part 1: Monitoring Prisma Access Health
Regularly monitoring the health and performance of your Prisma Access deployment is crucial for identifying potential issues before they impact users and for ensuring optimal service delivery.
Using the Prisma Access Insights Dashboard
- Primary Tool: Prisma Access Insights is the central dashboard for monitoring. It's integrated into the Prisma SASE Cloud Management UI and accessible via the Hub. Panorama users also rely heavily on Insights via the Hub, though some status is visible in the Cloud Services plugin.
- Key Sections/Dashboards:
- Summary/Overview: Provides a high-level view of service status, active users/sites, bandwidth usage, alerts, and top threats.
- Mobile Users: Monitors connectivity, authentication, latency, bandwidth usage per user, location, OS, and GlobalProtect version.
- Remote Networks: Tracks tunnel status (IPSec/BGP), latency, jitter, packet loss, bandwidth utilization per site, and configuration status.
- Service Connections: Similar metrics to Remote Networks, focused on datacenter connectivity.
- Alerts: Centralized view of triggered alerts based on configured thresholds.
- Security: Dashboards visualizing threat activity, URL filtering trends, WildFire submissions, etc.
- Data Source: Insights primarily visualizes telemetry and log data collected by Prisma Access and stored in Cortex Data Lake.
- Time Range Selection: Allows analyzing trends over different time periods (e.g., last hour, 24 hours, 7 days).
Key Metrics to Monitor
- Connectivity Status:
- Tunnel Status (RN/SC): Up/Down status for IPSec and BGP sessions. Frequent flaps indicate instability.
- Mobile User Connections: Number of active users, successful vs failed connection attempts.
- Performance Metrics:
- Latency: Round-trip time between user/site and Prisma Access PoP, and potentially to application destinations. High latency impacts user experience. Monitor average and peak latency per location/user.
- Jitter & Packet Loss (RN/SC): Especially important for real-time applications (VoIP/Video). High jitter/loss indicates network instability, often on the local ISP link.
- Bandwidth Utilization: Monitor aggregate and per-site/user bandwidth usage against licensed limits. High sustained utilization may require upgrades or QoS.
- Resource Utilization (Prisma Access Infrastructure): While managed by Palo Alto Networks, Insights may provide indicators of SPN or other infrastructure health/load, though direct resource metrics are generally abstracted. Palo Alto Networks SRE teams monitor the backend infrastructure.
- Threat Activity: Top threats detected (malware, C&C, phishing), blocked URLs, WildFire verdicts. Spikes may indicate targeted attacks or outbreaks.
- Authentication Failures: High rates of GlobalProtect login failures can indicate IdP issues, credential problems, or configuration errors.
- HIP Compliance Status: Monitor the number of compliant vs non-compliant devices connecting.
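The point about frequent tunnel flaps can be made concrete. The following is a minimal sketch, assuming tunnel status has been exported as a time-ordered list of "up"/"down" samples; the sample format and the `count_flaps` helper are hypothetical and are not part of any Insights API.

```python
def count_flaps(samples):
    """Count up -> down transitions in a time-ordered list of status strings."""
    flaps = 0
    for previous, current in zip(samples, samples[1:]):
        if previous == "up" and current == "down":
            flaps += 1
    return flaps


# Hypothetical polling results for one Remote Network tunnel.
history = ["up", "up", "down", "up", "down", "down", "up"]
print(count_flaps(history))
```

Counting only up-to-down transitions avoids double-counting a single outage; what number of flaps per hour counts as "frequent" is a local judgment call.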
Alerting Configuration
- Purpose: Proactively notify administrators about critical events or threshold breaches.
- Configuration Location: Typically configured within Prisma Access Insights (via the Hub/Prisma SASE UI).
- Configurable Alerts (Examples):
- Tunnel Down (RN/SC)
- BGP Peer Down (RN/SC)
- High Latency Threshold Exceeded
- High Bandwidth Utilization (Aggregate or Per-Site)
- High Packet Loss/Jitter
- Excessive Authentication Failures
- Critical Threat Detections
- License Usage Nearing Limit
- Infrastructure/Service Degradation Events (often notified by Palo Alto Networks directly via status page/email as well)
- Notification Methods: Email, Webhooks (for integration with tools like Slack, PagerDuty), SNMP Traps (via Panorama if configured).
- Best Practice: Configure alerts for key failure conditions (tunnels, BGP) and critical performance thresholds. Avoid overly noisy alerts. Tune thresholds based on baseline performance.
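One way to tune a threshold against baseline performance is to derive it from observed samples. The sketch below uses mean plus three standard deviations, which is a common heuristic and not a method prescribed by Prisma Access; the function name, the multiplier `k`, and the sample values are illustrative assumptions.

```python
from statistics import mean, stdev


def latency_threshold(baseline_ms, k=3.0):
    """Return mean + k * sample standard deviation of the baseline samples."""
    return mean(baseline_ms) + k * stdev(baseline_ms)


# Hypothetical latency samples (ms) collected during normal operation.
baseline = [40, 42, 38, 41, 39]
threshold = latency_threshold(baseline)
print(f"Alert when latency exceeds {threshold:.1f} ms")
```

A larger `k` yields fewer, less noisy alerts at the cost of slower detection; the resulting value would then be entered manually as the alert threshold.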
Part 2: Logging and Reporting
Prisma Access leverages Cortex Data Lake (CDL) for centralized, scalable logging. Accessing and interpreting these logs is vital for troubleshooting, security analysis, and compliance reporting.
Querying Logs in Cortex Data Lake
- Access Methods:
- Panorama Monitor Tab: Familiar interface for Panorama users. Provides querying, filtering, and predefined views for various log types (Traffic, Threat, URL, etc.) stored in CDL.
- Prisma SASE UI (Monitor): Cloud-managed interface offers similar log viewing and filtering capabilities.
- Prisma Access Insights: Uses CDL data but presents it in aggregated, dashboard-focused views rather than raw log lines.
- Cortex XDR Console (Explore): If using Cortex XDR, its 'Explore' feature provides powerful querying (using XQL) directly against CDL data.
- Cortex Data Lake App (Hub): Provides direct access to query logs using specific syntax, manage log forwarding, and view storage consumption.
- Filtering and Querying:
- Use the filter builders provided in Panorama/Cloud UI.
- Construct queries using logical operators (AND, OR, NOT) and specific field values (e.g., `(addr.src in 10.1.1.5) and (action eq deny)`). Syntax varies slightly depending on the tool (Panorama filters vs XQL).
- Filter by time range, log type, source/destination IP, user, application, action, threat type, URL category, etc.
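When the same filters are typed repeatedly, it can help to assemble them programmatically and paste the result into the filter bar. This is a small sketch that only reproduces the Panorama-style syntax of the example filter above; the helper names `clause` and `all_of` are hypothetical, and XQL would need a different format.

```python
def clause(field, operator, value):
    """Format one Panorama-style filter clause, e.g. (action eq deny)."""
    return f"({field} {operator} {value})"


def all_of(*clauses):
    """Join clauses so that every one of them must match."""
    return " and ".join(clauses)


query = all_of(
    clause("addr.src", "in", "10.1.1.5"),
    clause("action", "eq", "deny"),
)
print(query)  # (addr.src in 10.1.1.5) and (action eq deny)
```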
Conceptual Log Query Process
flowchart TD
A[Admin Needs Info] --> B{Choose Access Method};
B -- Panorama --> C[Panorama Monitor Tab];
B -- Cloud UI --> D[Prisma SASE Monitor];
B -- XDR --> E[Cortex XDR Explore];
B -- Hub --> F[CDL App Query];
subgraph Query Interface
C --> G{Build Filter / Query};
D --> G;
E --> G;
F --> G;
end
G -- Query Sent --> H((Cortex Data Lake));
H -- Log Results --> I{Display Results};
subgraph Results Display
C --> I;
D --> I;
E --> I;
F --> I;
end
I --> J[Admin Analyzes Logs];
Using Predefined and Custom Reports
- Panorama Reporting:
- Predefined Reports: Offers numerous built-in reports for traffic summaries, threat analysis, URL activity, user activity, VPN usage, etc. (accessible via `Panorama > Monitor > Reports`).
- Custom Reports: Allows creating tailored reports based on specific log types, filters, grouping criteria, and timeframes.
- Report Groups: Schedule generation and emailing of multiple reports.
- Prisma Access Insights: While dashboard-focused, Insights essentially provides predefined, interactive reports on various aspects of the service. Data can often be exported.
- Cloud Management UI: May offer built-in reporting capabilities similar to Panorama's predefined reports.
- Use Cases: Compliance reporting, security posture summaries, usage trend analysis, executive summaries.
Interpreting Different Log Types
- Traffic Logs: Show session start/end information. Key fields: Source/Destination IP/Port/Zone/User, Application (App-ID), Rule Matched, Action (allow/deny/drop), Bytes Sent/Received, Session End Reason. Essential for connectivity troubleshooting.
- Threat Logs: Detail detected threats. Key fields: Threat Type (virus, spyware, vulnerability, wildfire), Severity, Action Taken (alert, block, sinkhole), Source/Destination IP/User, Threat Name/ID, URL (if applicable). Crucial for security incident analysis.
- URL Filtering Logs: Show web activity. Key fields: Source IP/User, Destination URL, URL Category, Action (allow, block, continue, override), Rule Matched. Used for web usage analysis and troubleshooting web access issues.
- User-ID Logs: Track IP-to-User mappings learned via GP login, SAML, etc., and group mapping events. Key for verifying user identification and group membership used in policies.
- GlobalProtect Logs (System Logs): Show GP Portal/Gateway events - successful/failed logins, authentication details, HIP check results, tunnel establishment events. Filter by `(subtype eq globalprotect)`. Vital for MU connection issues.
- Tunnel Logs (System Logs): Show IKE and IPSec negotiation events for RN/SC tunnels. Filter by `(subtype eq vpn)`. Essential for site-to-site VPN troubleshooting. Look for Phase 1 / Phase 2 success or failure messages.
- System Logs: General device/service health messages, including BGP peering events (`subtype eq routing`), configuration changes, hardware/software events (though much infrastructure is managed by PANW).
- Configuration Logs: Audit trail of configuration changes made via Panorama or Cloud UI, showing who changed what and when.
- DNS Security Logs (Threat Logs): Show DNS queries identified as malicious. Look for Threat Type `dns`.
- DLP Logs: Show detected data pattern matches and actions taken (requires DLP license and configuration).
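The System log subtypes mentioned above can be kept in a small lookup so that the right filter is at hand during an incident. This is only a convenience sketch: the symptom labels and the `system_log_filter` helper are hypothetical, while the filter strings are the ones listed in this section.

```python
# Symptom labels are illustrative; the filters come from the list above.
SYSTEM_LOG_FILTERS = {
    "mobile user connection": "(subtype eq globalprotect)",
    "site-to-site tunnel": "(subtype eq vpn)",
    "bgp peering": "(subtype eq routing)",
}


def system_log_filter(symptom):
    """Return the System log filter for a symptom, or None if not listed."""
    return SYSTEM_LOG_FILTERS.get(symptom.lower())


print(system_log_filter("Site-to-Site Tunnel"))  # (subtype eq vpn)
```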
Part 3: Troubleshooting Common Issues
A systematic approach is key to resolving issues efficiently.
Mobile User Connectivity Problems
Troubleshooting Steps:
- Check Client Status & Logs: Start with the GlobalProtect client itself. Check the status panel for error messages. Collect logs from the client (Settings > Troubleshooting > Collect Logs).
- Verify Network Connectivity: Can the user reach the GP Portal address from their current location? Check local firewall, ISP issues.
- Authentication Failures:
- Check credentials, account lockouts (AD/IdP).
- Examine GP System Logs on Panorama/CDL for detailed authentication failure reasons (filter by user, `subtype eq globalprotect`).
- Verify SAML IdP configuration and status. Check IdP logs.
- Check RADIUS/LDAP server connectivity and configuration if used.
- Gateway Connection Issues:
- Check if Portal assigned gateways correctly.
- Look for tunnel negotiation errors in GP System Logs.
- Verify IP pool availability for mobile users.
- Check whether Security Policies are blocking GP traffic (less common).
- Split Tunnel Problems:
- Verify split tunnel configuration (include/exclude lists, applications, domains) in the GP Agent config (Portal).
- Check client's effective routes (e.g., `ipconfig /all`, `route print` on Windows). Is traffic destined for the tunnel or direct?
- Confirm DNS resolution is working correctly for split-tunneled domains.
- HIP Failures:
- Check HIP Match logs on Panorama/CDL (filter by `subtype eq hip-match`). See which object/profile failed.
- Verify the HIP configuration on the client matches the requirements (e.g., AV running, disk encrypted).
- Check Security Policy rules using the HIP profile – are non-compliant users blocked as expected?
- Performance Issues: Check latency to the connected gateway (GP Client Status panel). Test throughput. May indicate local network issues or needing to connect to a closer gateway.
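For the split tunnel checks above, it can be useful to reason about where a given destination should go before comparing against the client's route table. The sketch below is a simplified model, not the GlobalProtect implementation: it assumes the most specific matching prefix decides, that a tie favors the tunnel, and that an empty include list means full tunnel. The function name and example routes are hypothetical.

```python
import ipaddress


def route_decision(destination, include_routes, exclude_routes):
    """Return "tunnel" or "direct" for a destination IP (simplified model)."""
    addr = ipaddress.ip_address(destination)
    candidates = []
    for cidr in include_routes:
        net = ipaddress.ip_network(cidr)
        if addr in net:
            candidates.append((net.prefixlen, "tunnel"))
    for cidr in exclude_routes:
        net = ipaddress.ip_network(cidr)
        if addr in net:
            candidates.append((net.prefixlen, "direct"))
    if candidates:
        # Longest prefix wins; on equal length "tunnel" sorts after "direct".
        return max(candidates)[1]
    # No match: an empty include list is treated as full tunnel.
    return "direct" if include_routes else "tunnel"


includes = ["10.0.0.0/8"]
excludes = ["10.50.0.0/16"]
print(route_decision("10.50.1.1", includes, excludes))  # direct
print(route_decision("10.1.1.1", includes, excludes))   # tunnel
```

If the predicted path differs from what `route print` shows on the client, the agent configuration pushed by the Portal is the first thing to recheck.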
Remote Network Tunnel Down Issues
Troubleshooting Steps:
- Check Status (Prisma Access & Branch): Verify tunnel status in Prisma Access Insights / Cloud Services Status / Monitor tab. Check status on the branch device CLI/UI.
- Verify Configuration Parameters: Mismatched parameters are the most common cause. Double-check meticulously on both Prisma Access (Panorama/Cloud UI) and the branch device:
- IKE Version (v1/v2)
- IPSec/IKE Crypto Profiles (Encryption, Auth, DH Group, Lifetimes - ensure *exact* match)
- Pre-Shared Key (check for typos, complexity issues) or Certificate validity/trust.
- Local/Peer IDs (IP Address, FQDN, Key ID - must match expectations).
- Tunnel Interface IP addresses (if numbered).
- Proxy IDs (if used - ensure match or use route-based 0.0.0.0/0).
- Check Network Connectivity: Can the branch device's public IP reach the Prisma Access public IPs for the assigned PoP? Check intervening firewalls, routing, ISP issues.
- Examine Logs:
- Prisma Access: Check System logs (`subtype eq vpn`) on Panorama/CDL. Filter by the branch peer IP. Look for IKE Phase 1 / Phase 2 negotiation errors (e.g., "No proposal chosen", "Authentication failed", "Invalid ID").
- Branch Device: Check the corresponding VPN/IPSec/IKE logs on the branch device. Error messages often provide specific clues.
- Debug Commands (Branch Device - Examples): Commands vary by vendor.
- Palo Alto NGFW (On-Prem):
debug ike global on debug
debug ike pcap on
less mp-log ikemgr.log
less mp-log vpn.log
show vpn ike-sa gateway <gateway-name>
show vpn ipsec-sa tunnel <tunnel-name>
clear vpn ike-sa gateway <gateway-name>
clear vpn ipsec-sa tunnel <tunnel-name>
- Cisco IOS Router:
debug crypto isakmp
debug crypto ipsec
show crypto isakmp sa
show crypto ipsec sa
clear crypto isakmp
clear crypto ipsec sa peer <peer-ip>
- BGP Issues (If Tunnel is Up):
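Returning to the "Verify Configuration Parameters" step earlier in this list: because mismatched parameters are the most common cause of a tunnel staying down, a side-by-side comparison is worth automating once both configurations have been transcribed. The sketch below assumes the settings were copied by hand into two dictionaries; the key names and values are hypothetical and do not correspond to any export format.

```python
def find_mismatches(prisma_side, branch_side):
    """Return {parameter: (prisma_value, branch_value)} for every difference."""
    keys = sorted(set(prisma_side) | set(branch_side))
    return {
        key: (prisma_side.get(key), branch_side.get(key))
        for key in keys
        if prisma_side.get(key) != branch_side.get(key)
    }


# Hypothetical values transcribed from each side of the tunnel.
prisma = {"ike_version": "v2", "encryption": "aes-256-cbc",
          "dh_group": 14, "lifetime_s": 28800}
branch = {"ike_version": "v2", "encryption": "aes-256-cbc",
          "dh_group": 19, "lifetime_s": 28800}

print(find_mismatches(prisma, branch))  # {'dh_group': (14, 19)}
```

A parameter present on only one side shows up with `None` for the other, which also catches settings that were never configured on the branch device.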
Service Connection Problems
- Troubleshooting Steps: Very similar to Remote Networks, as they use the same underlying IPSec/BGP technologies.
- Key Differences/Focus:
- Often higher bandwidth and stricter HA requirements.
- Routing complexity might be higher due to datacenter networks.
- Ensure correct subnets are advertised from the datacenter via BGP.
- Verify Prisma Access Mobile User IP pools are correctly routed towards the Service Connection within the corporate network.
Policy Enforcement Issues
Troubleshooting Steps:
- Identify the Specific Flow: Source IP/User, Destination IP/URL/App, Port/Service.
- Check Traffic Logs: Filter logs by the specific flow parameters.
- Rule Match: Which Security Policy rule did the traffic match? Is it the expected rule?
- Action: Was the action allow/deny as expected?
- Session End Reason: If denied/dropped, why? (e.g., "policy-deny", "threat", "url-block"). If allowed but the app doesn't work, it could be "tcp-rst-from-server" or another application-level issue.
- Verify Rule Order: Remember top-down evaluation. Is a rule higher up matching unexpectedly?
- Check Policy Objects: Are Address Objects, Service Objects, App-IDs, URL Categories, User/Groups defined correctly and accurately matching the traffic? (e.g., typo in IP, wrong App-ID, user not in expected group).
- Verify User-ID / Group Mapping: Check User-ID logs. Does Prisma Access have the correct IP-to-User mapping? Is the user correctly mapped to the group used in the policy?
- Verify HIP Profile Match: Check HIP Match logs if the policy uses HIP profiles. Is the device passing/failing the check as expected?
- Check NAT Policy: Is NAT being applied correctly (especially Source NAT for outbound)? Check NAT policy rules and Traffic logs (NAT source/destination fields).
- Check Decryption Policy: Is traffic being decrypted/not decrypted as intended? Is an exclusion rule matching unexpectedly? Are clients trusting the forward-trust CA?
- Use Policy Optimizer (Panorama): Helps identify unused rules, rules with port-based services that could use App-ID, shadowed rules.
- Global Find (Panorama): Search for objects (addresses, services) to see where they are used in policies.
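The rule-order step above is easier to reason about with a toy model of top-down, first-match evaluation. This is a deliberately simplified sketch: real Security Policy matching involves zones, App-ID, services, and more, and the rule structure, field names, and rule names here are hypothetical.

```python
def matches(rule, flow):
    """A rule matches when every field it specifies equals the flow's value."""
    return all(flow.get(field) == value
               for field, value in rule["match"].items())


def first_match(rules, flow):
    """Return the name of the first matching rule, evaluated top-down."""
    for rule in rules:
        if matches(rule, flow):
            return rule["name"]
    return None  # no explicit rule matched


rules = [
    {"name": "allow-web", "match": {"app": "web-browsing"}, "action": "allow"},
    {"name": "block-guest", "match": {"user": "guest"}, "action": "deny"},
]
flow = {"user": "guest", "app": "web-browsing"}

print(first_match(rules, flow))  # allow-web
```

Here the broader `allow-web` rule matches before `block-guest` is ever evaluated, which is the kind of unexpected match the Traffic log's rule field reveals.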
Performance Issues (Latency, Throughput)
Troubleshooting Steps:
- Isolate the Scope: Is the issue affecting specific users, sites, applications, or destinations? Is it consistent or intermittent?
- Check Prisma Access Insights: Review latency, bandwidth, packet loss metrics for the affected users/sites.
- Check Local Network: Rule this out early. Is the user's local Wi-Fi, ISP connection, or branch network experiencing issues? Run speed tests and ping tests to external sites *bypassing* Prisma Access if possible (e.g., using split tunnel).
- Gateway Selection (MU): Is the user connected to the optimal (lowest latency) gateway? (Check GP client status).
- Bandwidth Limits: Is the user/site hitting their allocated bandwidth limits? Check Insights. Consider QoS.
- Application Performance: Is the issue specific to one application? Could it be the application server itself, not the network path?
- Decryption Impact: Decryption is generally well-handled, but complex decryption scenarios can add some latency. If decryption is suspected, bypass it for the specific flow as a test only.
- Path Analysis: Use tools like `traceroute` or `mtr` (from the endpoint, if possible) towards the destination to identify potential high-latency hops along the path (which may be outside Prisma Access).
- Engage Support: If local network and Prisma Access metrics seem normal, but performance is poor, Palo Alto Networks support may need to investigate internal routing or SPN performance.
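For the path analysis step above, the hop where latency rises most sharply is usually the first place to look. The sketch below assumes per-hop round-trip times have already been read off `traceroute` or `mtr` output into a list; it does not parse either tool's output, and the function name and sample values are hypothetical.

```python
def largest_jump(hop_rtts_ms):
    """Return (hop_number, increase_ms) for the biggest rise between hops.

    Hop numbers are 1-based; at least two hops are required.
    """
    jumps = [
        (later - earlier, index + 2)
        for index, (earlier, later) in enumerate(zip(hop_rtts_ms, hop_rtts_ms[1:]))
    ]
    increase, hop = max(jumps)
    return hop, increase


# Hypothetical average RTT (ms) per hop, in path order.
rtts = [2.0, 5.0, 6.5, 48.0, 50.0]
print(largest_jump(rtts))  # (4, 41.5)
```

Treat the result as a hint rather than proof: intermediate routers often deprioritize replies to probe packets, so a high RTT at one hop matters mainly when later hops stay high as well.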
Panorama / Cloud Services Plugin Issues
- Plugin Not Connecting/Authenticated: Verify Panorama has internet access, DNS resolution works, and the Panorama service account can authenticate to the Hub. Re-authenticate via `Cloud Services > Service Setup`.
- Configuration Push Failures: Check Task Manager in Panorama for detailed error messages. Common causes: configuration errors (validation failures), plugin communication issues, transient cloud service problems. Review the configuration pushed for errors.
- Status Not Updating: Ensure Panorama can reach the Cloud Services infrastructure. Check plugin logs (`less mp-log cloud_services.log` or via `Technical Support File`).
- Plugin Version Compatibility: Ensure the Cloud Services plugin version is compatible with the Panorama software version and the Prisma Access backend version. Check release notes.
- Resource Issues on Panorama: Ensure the Panorama VM/appliance has sufficient CPU/Memory/Disk resources, especially during large commits or log queries.
Part 4: Software Updates
Keeping components updated is essential for security patches, bug fixes, and new features.
Prisma Access Infrastructure Updates
- Managed by Palo Alto Networks: The software on the underlying infrastructure (SPNs, CANs, Gateways, Portals), i.e., the PAN-OS version running in the cloud, is updated and patched by Palo Alto Networks.
- Maintenance Notifications: Customers are typically notified in advance via email and/or Hub notifications about scheduled maintenance windows for infrastructure updates.
- Version Selection: Customers usually select a preferred PAN-OS version for their Prisma Access instance from a list of qualified versions provided by Palo Alto Networks (configured via Cloud Services plugin or Cloud UI). Updates to new major versions might require customer scheduling/approval.
- Hitless Updates (Goal): Updates are designed to be hitless or minimally disruptive, leveraging infrastructure redundancy.
GlobalProtect Client Updates
- Deployment Methods:
- Via GlobalProtect Portal: Upload desired client versions to the Portal configuration (Panorama/Cloud UI). Configure agent settings to allow/prompt/force upgrades. Clients check the portal periodically.
- Manual Deployment: Distribute client installers via software deployment tools (SCCM, Intune, Jamf), email, or download links.
- Version Control: Configure minimum allowed client versions in the GP Portal/Agent settings.
- Testing: Recommended to test new client versions with a pilot group before broad deployment.
- Release Cadence: Palo Alto Networks releases new GP client versions regularly with fixes and features. Check release notes.
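When auditing installed clients against a minimum allowed version, compare versions numerically rather than as strings, since a plain string comparison would rank "5.2.13" below "5.2.8". A minimal sketch, assuming versions are purely dotted numbers (build suffixes are not handled) and that the inventory comes from some external source; the helper names are hypothetical.

```python
def parse_version(text):
    """Turn a dotted version such as "5.2.13" into a tuple of integers."""
    return tuple(int(part) for part in text.split("."))


def meets_minimum(client_version, minimum_version):
    """Return True when the client version is at or above the minimum."""
    return parse_version(client_version) >= parse_version(minimum_version)


print(meets_minimum("5.2.13", "5.2.8"))  # True
print(meets_minimum("5.2.8", "6.0.0"))   # False
```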
Panorama Cloud Services Plugin Updates
- Manual Process: Updating the plugin on Panorama is a manual process.
- Steps:
- Download the new compatible plugin version from the Support Portal.
- Upload and install via `Panorama > Plugins`. (Requires Panorama restart).
- Compatibility: Crucial to check release notes for compatibility between the plugin version, Panorama software version, and the Prisma Access backend PAN-OS version.
- When to Update: Update when required for compatibility with new Panorama/Prisma Access features, for bug fixes listed in the release notes, or as advised by support.