How do I account for memory effects in a transistor model for digital predistortion applications?
PA Memory Effects for DPD
Memory effects are the primary limitation of DPD performance. A memoryless DPD can improve the ACLR by 15-25 dB. Adding memory correction provides an additional 5-15 dB improvement.
Technical Considerations
(1) Two-tone test with varying tone spacing: apply two equal-amplitude tones at f1 = f0 - Δf/2 and f2 = f0 + Δf/2. Measure the IM3 levels at 2f1-f2 (lower) and 2f2-f1 (upper). Vary Δf from 100 kHz to 100 MHz. If memory effects are absent: the IM3 level and phase are constant vs Δf. If memory effects are present: the IM3 level varies with Δf (typically: higher IM3 at low Δf and lower at high Δf, or vice versa). An asymmetry between the upper and lower IM3 (IM3_upper ≠ IM3_lower) is a signature of memory effects.
(2) AM-AM/AM-PM vs modulation bandwidth: measure the PA AM-AM and AM-PM curves using signals with different bandwidths (10 MHz, 50 MHz, 100 MHz). If memory is present: the AM-AM curve changes shape with bandwidth (the compression point and gain roll-off differ), because the thermal and trapping dynamics interact differently with the modulation speed.
(3) Load-pull under modulated signal: measure the PA load-pull contours using a modulated signal (instead of CW). Memory effects cause the optimal load impedance to depend on the modulation bandwidth and crest factor.
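The two-tone sweep in item (1) can be tried out numerically before going to the bench. The sketch below is only an illustration under stated assumptions: `toy_pa` is a hypothetical third-order baseband model with a one-sample envelope delay, and the coefficients `c3` and `cm`, the 400 MHz simulation rate, and the record length are all made-up values, not measured PA data. It sweeps Δf and reads the lower and upper IM3 levels from single DFT bins.

```python
import cmath
import math

def toy_pa(x, c3, cm, d=1):
    """Hypothetical baseband PA: y(n) = x(n) + c3*x(n)|x(n)|^2 + cm*x(n)|x(n-d)|^2."""
    # x[n - d] wraps around at the start, which is exact for a periodic record.
    return [xn + c3 * xn * abs(xn) ** 2 + cm * xn * abs(x[n - d]) ** 2
            for n, xn in enumerate(x)]

def bin_level_db(y, cycles):
    """Level (dB re. unit amplitude) of the DFT bin with `cycles` periods per record."""
    n_samples = len(y)
    acc = sum(v * cmath.exp(-2j * math.pi * cycles * n / n_samples)
              for n, v in enumerate(y))
    return 20 * math.log10(abs(acc) / n_samples)

def im3_sweep(spacings_hz, c3, cm, fs=400e6, n_samples=8000):
    """Return (spacing, IM3_lower_dB, IM3_upper_dB) for each tone spacing."""
    rows = []
    for df in spacings_hz:
        # Whole number of cycles per record for each tone (coherent sampling).
        half = round(df / 2 * n_samples / fs)
        x = [cmath.exp(-2j * math.pi * half * n / n_samples)
             + cmath.exp(2j * math.pi * half * n / n_samples)
             for n in range(n_samples)]
        y = toy_pa(x, c3, cm)
        # In baseband the tones sit at -/+ half, so IM3 falls at -/+ 3*half.
        rows.append((df, bin_level_db(y, -3 * half), bin_level_db(y, 3 * half)))
    return rows

spacings = [100e3, 1e6, 10e6, 100e6]
memoryless = im3_sweep(spacings, c3=-0.05, cm=0.0)     # no memory term
with_memory = im3_sweep(spacings, c3=-0.05, cm=0.02j)  # delayed-envelope term

for df, im3_lo, im3_hi in with_memory:
    print(f"{df / 1e6:7.1f} MHz  lower {im3_lo:6.2f} dB  upper {im3_hi:6.2f} dB  "
          f"asymmetry {im3_hi - im3_lo:+5.2f} dB")
```

In this toy model the memoryless case gives identical IM3 levels at every spacing, while the delayed-envelope term makes the two IM3 products diverge as Δf grows. A real PA has several time constants, so the measured trend against Δf can look quite different.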
- Performance verification: confirm specifications against the application requirements before finalizing the design
- Environmental factors: temperature range, humidity, and vibration affect long-term reliability and parameter drift
- Cost vs. performance: evaluate whether the application demands premium components or standard commercial grades
- Interface compatibility: verify impedance, connector type, and mechanical form factor match the system architecture
- Margin allocation: include sufficient design margin to account for manufacturing tolerances and aging effects
Performance Analysis
(1) Generalized memory polynomial (GMP): the most widely used DPD model. In a simplified third-order form: y_predistorted(n) = sum_m sum_k (a_mk × x(n-m) × |x(n-m-k)|²), where m = delay (tap) index, running from 0 to the memory depth M (typically M = 3-7), k = cross-term offset between the signal and its envelope (k > 0 gives lagging-envelope terms, k < 0 gives leading-envelope terms, k = 0 gives the aligned terms), and a_mk = complex coefficients (obtained by least-squares fitting to the inverse of the PA behavior). Practical GMPs also include the linear terms x(n-m) and higher powers of the envelope. The GMP captures both the nonlinearity (through the |x|² terms) and the memory (through the delayed x(n-m) terms).
(2) Volterra-based DPD: uses the full Volterra series inverse. More general than GMP but much more computationally expensive. Typically truncated to third order with 3-5 memory taps.
(3) Neural network DPD: uses a neural network (typically 2-3 hidden layers) to learn the PA inverse function including memory. It makes no assumptions about the PA model structure and can capture complex memory effects that polynomial models miss, at a higher computational cost. Used in: advanced 5G base stations where the PA exhibits strong combined thermal and trapping memory.
(4) Implementation: the DPD coefficients are computed in real time by a feedback loop: the PA output is sampled by an observation receiver, compared to the desired output, and the DPD coefficients are updated. The coefficients are typically updated every 1-10 ms (fast enough to track slow thermal and aging changes).
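The GMP fit described in item (1) can be sketched with NumPy as follows. This is a minimal illustration, not a production DPD: `toy_pa` is a hypothetical PA model with invented coefficients, the basis adds the linear terms x(n-m) that the simplified formula omits, the fit uses the indirect-learning approach (fit a post-inverse from PA output to PA input, then reuse it as the predistorter), and NMSE is used as a simple stand-in for an ACLR measurement.

```python
import numpy as np

def delay(x, d):
    """Return x delayed by d samples (negative d advances), zero-padded."""
    out = np.zeros_like(x)
    if d == 0:
        out[:] = x
    elif d > 0:
        out[d:] = x[:-d]
    else:
        out[:d] = x[-d:]
    return out

def gmp_basis(x, M=2, K=1):
    """Columns x(n-m) and x(n-m)*|x(n-m-k)|^2 for m = 0..M, k = -K..K."""
    cols = []
    for m in range(M + 1):
        xm = delay(x, m)
        cols.append(xm)  # linear term (an addition to the simplified formula)
        for k in range(-K, K + 1):
            cols.append(xm * np.abs(delay(x, m + k)) ** 2)
    return np.column_stack(cols)

def toy_pa(x):
    """Hypothetical PA: cubic compression plus a one-sample envelope memory term."""
    return (x + (-0.10 + 0.03j) * x * np.abs(x) ** 2
              + (0.03 - 0.02j) * x * np.abs(delay(x, 1)) ** 2)

def nmse_db(ref, sig):
    """Normalized mean-square error of sig against ref, in dB."""
    return 10 * np.log10(np.sum(np.abs(sig - ref) ** 2) / np.sum(np.abs(ref) ** 2))

rng = np.random.default_rng(0)
n = 20000
x = 0.3 * (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / np.sqrt(2)

y = toy_pa(x)                                   # PA output without DPD
# Indirect learning: least-squares fit of the post-inverse (y -> x).
coeffs, *_ = np.linalg.lstsq(gmp_basis(y), x, rcond=None)
x_pd = gmp_basis(x) @ coeffs                    # reuse the fit as predistorter
y_lin = toy_pa(x_pd)                            # PA output with DPD

nmse_raw = nmse_db(x, y)
nmse_dpd = nmse_db(x, y_lin)
print(f"NMSE without DPD: {nmse_raw:.1f} dB, with GMP DPD: {nmse_dpd:.1f} dB")
```

With M = 2 and K = 1 the basis has 12 columns. Raising M, K, or the envelope order adds columns quickly, which is the computation cost the text refers to.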
Frequently Asked Questions
Do all PAs have memory effects?
All PAs have some memory effects, but the significance depends on:
(1) Technology: GaN PAs have the strongest memory (trapping + thermal). GaAs PAs have moderate memory (thermal only, no significant trapping). Si CMOS/SiGe PAs have moderate thermal memory.
(2) Signal bandwidth: for narrowband signals (< 5 MHz), memory effects are often negligible (the signal changes slowly compared to the memory time constants). For wideband signals (20-100 MHz 5G NR), memory effects are significant and must be compensated.
(3) Operating point: a PA operated in deep compression has stronger memory (the nonlinearity interacts more strongly with the dynamic effects). A PA operated with 6+ dB back-off has weaker memory (the device stays closer to the linear region).
Rule: if the ACLR spec is tighter than -45 dBc and the signal BW is > 20 MHz, memory effects will limit the DPD performance if not accounted for.
How many memory taps do I need in my DPD?
The number of memory taps (M) in the GMP determines how far into the past the DPD looks. Each tap adds one sample delay (at the DPD sampling rate): for 100 MHz sampling, one tap = 10 ns.
Thermal memory: tau_th = 1-100 ms = 100,000-10,000,000 samples. It is impractical to have this many taps. Instead: thermal memory is captured by a slow adaptation loop (the DPD coefficients are updated every few ms based on the average PA behavior).
Trapping memory: tau_trap = 1-100 us = 100-10,000 samples. A few taps (3-7 at the symbol rate) capture the fast component. The slow component is captured by the coefficient adaptation.
Bias memory: typically < 1 us (fast decoupling). Captured by the first 1-3 taps.
In practice: M = 3-7 memory taps at the symbol rate provides adequate performance for most 5G PAs. Increasing beyond 7 taps provides diminishing returns and increases the DPD computation cost.
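The sample counts quoted above follow directly from the time constants and the 100 MHz sampling rate. The short sketch below reproduces that arithmetic; the helper name `tau_to_samples` is illustrative only.

```python
def tau_to_samples(tau_s, fs_hz):
    """Number of sample periods spanned by a time constant."""
    return round(tau_s * fs_hz)

FS_HZ = 100e6            # DPD sampling rate used in the text
TAP_NS = 1e9 / FS_HZ     # duration of one tap delay in nanoseconds

# Time-constant ranges quoted in the text, in seconds.
TIME_CONSTANTS_S = {
    "thermal": (1e-3, 100e-3),
    "trapping": (1e-6, 100e-6),
}

spans = {name: tuple(tau_to_samples(t, FS_HZ) for t in limits)
         for name, limits in TIME_CONSTANTS_S.items()}
tap_window_ns = [taps * TAP_NS for taps in (3, 7)]

for name, (low, high) in spans.items():
    print(f"{name:9s}: {low:,} to {high:,} samples")
print(f"3 to 7 taps at {FS_HZ / 1e6:.0f} MHz span "
      f"{tap_window_ns[0]:.0f} to {tap_window_ns[1]:.0f} ns")
```

At 100 MHz, a 3-7 tap window covers only tens of nanoseconds, which is why the slower thermal and trapping dynamics are left to the coefficient adaptation loop.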
How does the DPD adapt to changing conditions?
The DPD uses a closed-loop adaptation:
(1) Observation receiver: a low-power copy of the PA output is digitized by an ADC (the observation receiver).
(2) Comparison: after time alignment and gain normalization, the observed output is compared to the desired output (the input signal before DPD).
(3) Error calculation: the difference (error) between observed and desired is computed.
(4) Coefficient update: the DPD coefficients (a_mk) are updated to minimize the error, using either a block least-squares solve or an adaptive algorithm such as LMS or RLS.
(5) Update interval: typically every 1-10 ms (fast enough to track temperature changes and aging).
This closed-loop adaptation automatically compensates for: temperature changes (the PA characteristics change with ambient and junction temperature), aging (gradual degradation of the PA over months/years), component tolerances (initial mismatch between the PA and the DPD model), and supply voltage variation (battery voltage droop in UE devices).
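The loop above can be illustrated with a small simulation. The sketch makes several assumptions: `toy_pa` is a hypothetical PA whose cubic coefficient steps from -0.10 to -0.05 halfway through (standing in for a thermal drift), each loop iteration represents one update interval, the update is a damped block least-squares step with an arbitrary damping factor `mu` (chosen here instead of sample-by-sample LMS or RLS), and the observation path is assumed to be already time-aligned and gain-normalized.

```python
import numpy as np

def basis(x):
    """Small DPD basis: x(n), x(n)|x(n)|^2 and x(n)|x(n-1)|^2."""
    prev = np.concatenate(([0], x[:-1]))
    return np.column_stack([x, x * np.abs(x) ** 2, x * np.abs(prev) ** 2])

def toy_pa(x, c3):
    """Hypothetical PA with a drifting cubic term and a fixed memory term."""
    prev = np.concatenate(([0], x[:-1]))
    return x + c3 * x * np.abs(x) ** 2 + 0.03 * x * np.abs(prev) ** 2

def nmse_db(ref, sig):
    """Normalized mean-square error of sig against ref, in dB."""
    return 10 * np.log10(np.sum(np.abs(sig - ref) ** 2) / np.sum(np.abs(ref) ** 2))

rng = np.random.default_rng(1)
coeffs = np.array([1, 0, 0], dtype=complex)   # start as a pass-through
mu = 0.5                                      # damping factor (assumed value)
history = []

for block in range(12):
    c3 = -0.10 if block < 6 else -0.05        # hypothetical drift at block 6
    x = 0.3 * (rng.standard_normal(4000)
               + 1j * rng.standard_normal(4000)) / np.sqrt(2)
    u = basis(x) @ coeffs                     # predistorted signal into the PA
    y = toy_pa(u, c3)                         # observation receiver capture
    history.append(nmse_db(x, y))             # error against the desired output
    # Fit a post-inverse (y -> u) on this block, then move part of the way to it.
    target, *_ = np.linalg.lstsq(basis(y), u, rcond=None)
    coeffs = (1 - mu) * coeffs + mu * target

for block, value in enumerate(history):
    print(f"block {block:2d}: NMSE {value:6.1f} dB")
```

In this toy setup the error should fall over the first few blocks, jump when the PA coefficient changes, and then fall again as the loop re-converges. A smaller `mu` gives smoother but slower tracking.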