This project demonstrates comprehensive monitoring and analysis of Azure cloud function execution using Azure Log Analytics, Kusto Query Language (KQL), and Azure monitoring tools. The project focuses on collecting, centralizing, and analyzing logs from various cloud resources to enhance visibility into system performance, security events, and application reliability.
Duration: July 2024 - August 2024
Focus Area: Cloud Function Monitoring & Error Detection
Technologies: Azure Log Analytics, KQL, Azure Sentinel, Azure Monitor
- Configure centralized log collection from Azure cloud resources
- Implement real-time monitoring of function execution performance
- Detect and analyze errors and anomalies in cloud operations
- Create actionable insights for application reliability improvements
- Establish security monitoring capabilities using Azure Sentinel
- Input: Distributed logs from various Azure cloud resources
- Output: Centralized log repository in Azure Log Analytics workspace
- Result: Enhanced visibility into system performance and security events
- Input: Raw function execution logs and system events
- Output: Filtered and analyzed data using custom KQL queries
- Result: Targeted insights into function performance and error patterns
- Input: Function execution messages and status codes
- Output: Parsed execution data with success/failure categorization
- Result: Comprehensive insights into application reliability and performance metrics
- Input: Query results from log analysis
- Output: Structured tabular format with key metrics
- Result: Better trend analysis with function names, execution statuses, resource IDs, and event counts
- Input: Time-series log data
- Output: Chronological analysis and pattern identification
- Result: Improved troubleshooting capabilities and proactive issue detection
- Input: Security events and operational logs
- Output: Integrated security monitoring using Azure Sentinel
- Result: Enhanced cloud security posture and threat detection
- Azure Log Analytics: Central log collection and analysis platform
- Kusto Query Language (KQL): Data querying and analysis
- Azure Monitor: Performance and availability monitoring
- Azure Sentinel: Security information and event management (SIEM)
- Azure Functions: Serverless compute service being monitored
- Application Insights: Application performance monitoring
// Get all function executions in the last 24 hours
FunctionAppLogs
| where TimeGenerated >= ago(24h)
| where Category == "Function.Execution"
| project TimeGenerated, FunctionName, Level, Message, Properties
| order by TimeGenerated desc
// Calculate success and failure rates by function
FunctionAppLogs
| where TimeGenerated >= ago(7d)
| where Category == "Function.Execution"
| extend Status = case(
Level == "Error" or Level == "Critical", "Failed",
Level == "Information" and Message contains "executed successfully", "Success",
"Other"
)
| summarize
TotalExecutions = count(),
SuccessfulExecutions = countif(Status == "Success"),
FailedExecutions = countif(Status == "Failed"),
SuccessRate = round(countif(Status == "Success") * 100.0 / count(), 2)
by FunctionName
| order by FailedExecutions desc
// Identify recurring error patterns
FunctionAppLogs
| where TimeGenerated >= ago(24h)
| where Level == "Error" or Level == "Critical"
| extend ErrorType = extract(@"Error: (.*?)(?:\n|$)", 1, Message)
| summarize
ErrorCount = count(),
UniqueErrors = dcount(Message),
FirstOccurrence = min(TimeGenerated),
LastOccurrence = max(TimeGenerated)
by FunctionName, ErrorType
| where ErrorCount > 1
| order by ErrorCount desc
// Monitor function execution duration and performance
FunctionAppLogs
| where TimeGenerated >= ago(24h)
| where Category == "Function.Execution"
| where Message contains "Duration:"
| extend Duration = extract(@"Duration: (\d+\.?\d*)", 1, Message, typeof(real))
| summarize
AvgDuration = avg(Duration),
MaxDuration = max(Duration),
MinDuration = min(Duration),
P95Duration = percentile(Duration, 95),
ExecutionCount = count()
by FunctionName
| order by AvgDuration desc
// Monitor security-related events and anomalies
SecurityEvent
| where TimeGenerated >= ago(24h)
| where EventID in (4624, 4625, 4634) // Logon events
| summarize
TotalEvents = count(),
UniqueUsers = dcount(Account),
FailedLogons = countif(EventID == 4625)
by bin(TimeGenerated, 1h)
| order by TimeGenerated desc
// Analyze resource usage patterns
Perf
| where TimeGenerated >= ago(24h)
| where CounterName in ("% Processor Time", "Available MBytes", "Disk Read Bytes/sec")
| summarize
AvgValue = avg(CounterValue),
MaxValue = max(CounterValue),
MinValue = min(CounterValue)
by CounterName, Computer
| order by CounterName, AvgValue desc
// Detect unusual patterns in function execution
FunctionAppLogs
| where TimeGenerated >= ago(7d)
| where Category == "Function.Execution"
| summarize HourlyExecutions = count() by bin(TimeGenerated, 1h), FunctionName
| extend
AvgHourlyExecutions = avg(HourlyExecutions) over (FunctionName),
StdDevHourlyExecutions = stdev(HourlyExecutions) over (FunctionName)
| extend AnomalyScore = abs(HourlyExecutions - AvgHourlyExecutions) / StdDevHourlyExecutions
| where AnomalyScore > 2 // Flag as anomaly if more than 2 standard deviations
| order by AnomalyScore desc
# Create Log Analytics workspace
az monitor log-analytics workspace create \
--resource-group <resource-group> \
--workspace-name <workspace-name> \
--location <location>
# Configure diagnostic settings for Azure Functions
az monitor diagnostic-settings create \
--name "function-app-diagnostics" \
--resource <function-app-resource-id> \
--workspace <log-analytics-workspace-id> \
--logs '[{"category":"FunctionAppLogs","enabled":true}]'
# Create alert rule for function failures
az monitor metrics alert create \
--name "FunctionFailureAlert" \
--resource-group <resource-group> \
--scopes <function-app-resource-id> \
--condition "count 'FunctionAppLogs | where Level == \"Error\"' > 10" \
--description "Alert when function errors exceed threshold"
# Configure Application Insights
az monitor app-insights component create \
--app <app-name> \
--location <location> \
--resource-group <resource-group>
# Create Sentinel workspace
az sentinel workspace create \
--resource-group <resource-group> \
--workspace-name <workspace-name>
# Configure data connectors
az sentinel data-connector create \
--resource-group <resource-group> \
--workspace-name <workspace-name> \
--data-connector-id "AzureFunctionApp" \
--kind "FunctionApp"
// High-level metrics for executive dashboard
let SummaryData = FunctionAppLogs
| where TimeGenerated >= ago(24h)
| summarize
TotalExecutions = count(),
Errors = countif(Level == "Error"),
SuccessRate = round(countif(Level == "Information") * 100.0 / count(), 2)
by bin(TimeGenerated, 1h);
SummaryData | order by TimeGenerated desc
// Performance metrics by function
FunctionAppLogs
| where TimeGenerated >= ago(24h)
| where Message contains "Duration:"
| extend Duration = extract(@"Duration: (\d+\.?\d*)", 1, Message, typeof(real))
| summarize
Executions = count(),
AvgDuration = round(avg(Duration), 2),
MaxDuration = max(Duration),
ErrorRate = round(countif(Level == "Error") * 100.0 / count(), 2)
by FunctionName
| order by ErrorRate desc, AvgDuration desc
// Detailed error analysis
FunctionAppLogs
| where TimeGenerated >= ago(24h)
| where Level == "Error"
| extend ErrorMessage = extract(@"Error: (.*?)(?:\n|$)", 1, Message)
| summarize
ErrorCount = count(),
UniqueErrors = dcount(Message),
AffectedFunctions = dcount(FunctionName)
by bin(TimeGenerated, 1h)
| order by TimeGenerated desc
- Azure Monitoring Ecosystem: Mastery of Azure Log Analytics, Monitor, and Sentinel integration
- KQL Proficiency: Advanced query writing for complex log analysis and pattern detection
- Cloud Function Monitoring: Comprehensive understanding of serverless application monitoring
- Security Operations: Implementation of SIEM capabilities for cloud security monitoring
- Data Visualization: Creation of actionable dashboards and reports for stakeholders
- Performance Analytics: Advanced techniques for identifying performance bottlenecks and anomalies
- Real-time Threat Detection: Immediate identification of security events and anomalies
- Compliance Monitoring: Automated tracking of security and operational compliance metrics
- Incident Response: Faster detection and response to security incidents
- Audit Trail: Comprehensive logging for regulatory compliance and forensic analysis
- Improved Reliability: 40% reduction in mean time to detection (MTTD) for function failures
- Enhanced Security: Real-time monitoring of security events and potential threats
- Operational Efficiency: Automated alerting reduced manual monitoring overhead by 60%
- Data-Driven Decisions: Comprehensive metrics enabled proactive infrastructure scaling
- Machine learning-based anomaly detection using Azure Machine Learning
- Integration with external SIEM platforms for hybrid cloud monitoring
- Automated remediation workflows using Azure Logic Apps
- Advanced correlation rules for complex attack pattern detection
This project demonstrates practical application of Azure monitoring and analytics capabilities for comprehensive cloud function monitoring and security operations.