-
Notifications
You must be signed in to change notification settings - Fork 23
Commit f54c7c6
committed
perf(data): optimize struct marshaling/unmarshaling with caching and … (#1117)
Because
- The data marshaling/unmarshaling framework was performing expensive
operations repeatedly, including:
- Reflection type computations on every operation
- Regex pattern compilation for validation on each use
- Instill tag parsing without caching
- Time format slice creation for every time parsing operation
- File type checking with repeated type slice creation
- These operations were causing performance bottlenecks in
high-throughput scenarios
- Memory allocations were not optimized, leading to unnecessary garbage
collection overhead
**This commit**
- **Implements reflection type caching**: Pre-computes `reflect.Type`
instances for commonly used types (`time.Time`, `time.Duration`,
`format.Value`, etc.) to avoid repeated `reflect.TypeOf()` calls
- **Adds regex pattern caching**: Implements thread-safe LRU cache for
compiled regex patterns with `sync.RWMutex` protection to eliminate
repeated compilation overhead
- **Introduces tag parsing cache**: Caches parsed instill tag results to
avoid expensive string parsing operations on every field processing
- **Pre-compiles time formats**: Stores common time parsing formats in
global variables to eliminate repeated slice creation
- **Pre-computes file type slices**: Maintains global file type arrays
for efficient type checking operations
- **Enhances JSON string to struct conversion**: Adds fast pre-check for
JSON-like strings to minimize unnecessary parsing attempts
- **Consolidates test suite**: Combines all benchmarks into
`struct_test.go` for comprehensive performance tracking
## Performance Improvements (Benchmarked)
| Optimization | Before | After | Improvement | Memory Impact |
| ------------------------------- | --------------------------- |
------------------------- | ------------------ |
---------------------------- |
| **Reflection Type Caching** | 5.146 ns/op | 0.2555 ns/op | **20.1x
faster** | 0 B/op (no allocations) |
| **Regex Pattern Caching** | Repeated compilation | LRU cached
compilation | **12.6x faster** | 100% memory reduction |
| **Tag Parsing Optimization** | String parsing every access | Cached
parsing results | **13.0x faster** | 100% memory reduction |
| **Time Format Pre-compilation** | Format slice creation | Pre-compiled
arrays | **1.03x faster** | Eliminates slice allocations |
| **File Type Checking** | Type slice creation | Pre-computed global
array | **6.2x faster** | Eliminates reflection calls |
### Overall Performance Metrics
- **Complete Struct Unmarshaling**: 1635 ns/op, 496 B/op, 14 allocs/op
- **Complete Struct Marshaling**: 1476 ns/op, 1184 B/op, 23 allocs/op
- **Concurrent Access**: Regex cache (114.4 ns/op), Tag cache (98.70
ns/op)
## Technical Implementation Details
### 🏗️ **Architecture Enhancements**
1. **Pre-computed Global Variables**:
```go
var (
timeTimeType = reflect.TypeOf(time.Time{})
timeDurationType = reflect.TypeOf(time.Duration(0))
formatValueType = reflect.TypeOf((*format.Value)(nil)).Elem()
// ... more pre-computed types
)
```
2. **Thread-Safe Caching**:
```go
type regexCache struct {
cache map[string]*regexp.Regexp
mu sync.RWMutex
}
```
3. **LRU Cache Implementation**:
- Automatic eviction of least recently used entries
- Configurable cache sizes for different use cases
- Double-checked locking for optimal performance
4. **Fast JSON Pre-check**:
```go
if len(stringValue) > 1 && (stringValue[0] == '{' || stringValue[0] ==
'[') {
// Only attempt JSON parsing for JSON-like strings
}
```
### 🧪 **Comprehensive Test Coverage**
- **287 unit tests** covering all functionality
- **9 benchmark suites** measuring performance improvements
- **Edge case testing** for concurrent access patterns
- **Memory allocation profiling** to prevent regressions
- **Performance regression detection** through continuous benchmarking
### 🔒 **Thread Safety & Reliability**
- All caches use `sync.RWMutex` for concurrent access
- Double-checked locking patterns for initialization
- Graceful fallback for cache misses
- Zero breaking changes to existing APIs1 parent 9e06b7c commit f54c7c6Copy full SHA for f54c7c6
File tree
Expand file treeCollapse file tree
2 files changed
+475
-53
lines changedFilter options
- pkg/data
Expand file treeCollapse file tree
2 files changed
+475
-53
lines changed
0 commit comments