This notebook contains the analysis and results of an extensive phase overlay measurement campaign.

Preparations

Let’s start by loading my favorite R libraries:

library(data.table) # for efficient data handling
library(ggplot2) # for nice plots
library(stringi) # string ops

First, we need to load all CSV files containing the measurement results. Each file contains the measurements of a specific configuration of the parameters \(M\) (number of symbols), transmit gain, and carrier frequency offset.

# some paths
path_clean = "/home/matze/code/ads-b_phase_overlay/experiments/dat"
path_noise = "/home/matze/code/ads-b_phase_overlay/experiments/dat_noise"
path_freqo = "/home/matze/code/ads-b_phase_overlay/experiments/dat_freq"

# function loads and prepares all measurement data in a specific path
load_all <- function (path) {
  files = list.files(path, pattern = "\\.csv$")
  fileregex = "^exp_([0-9]+)BITS_([0-9]+(\\.[0-9]+)?)MHz_([-+]?[0-9]+(\\.[0-9]+)?)dB(_([-+]?[0-9]+(\\.[0-9]+))dB)?(_noise)?(_freqs)?\\.csv$"
  
  all = data.table()
  for (f in sort(files)) {
    # read file
    tmp = fread(paste0(path, "/", f))
    
    if (nrow(tmp) > 0) {
      # add software parameters from file name
      pars = stri_match(f, regex = fileregex)
      tmp$m = 2^as.integer(pars[2]) # number of symbols
      tmp$fc = as.numeric(pars[3])  # carrier frequency
      tmp$hwgain = as.numeric(pars[5]) # hardware gain (0 is full scale)
      tmp$attenuation = ifelse(is.na(pars[8]), 0, -as.numeric(pars[8])) # attenuation
      tmp$timestamp = tmp$timestamp - min(tmp$timestamp) # we want relative timestamps
      tmp$file = f
      tmp$psk_size = 112 * log2(tmp$m)
      
      if (nrow(all) > 0) {
        tmp$timestamp = tmp$timestamp + max(all$timestamp) + 10e9 # 10s gap
      }
      
      # append to full data set
      all = rbind(all, tmp)
    }
  }
  
  # add some useful fields
  all$snr = all$level_signal - all$level_noise # SNR
  all$fo = all$fc-1090 # carrier frequency offset
  all$timestamp = as.numeric(all$timestamp)
  all$gain = all$hwgain - all$attenuation
  
  # remove all data points that are not coming from our tests
  all = all[aircraft_address == "ffffff"]
  
  # ignore those with hardware gain >= 25
  all = all[hwgain < 25]
  
  return(all)
}
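
To make the file name parsing more tangible, here is a minimal sketch that applies the same pattern (with the literal dots escaped) to a single name using base R only. The file name is hypothetical and only inferred from the regular expression; the real files may be named differently.

```r
# Hypothetical file name (an assumption, inferred from the regex only)
fname <- "exp_3BITS_1090.05MHz_10.0dB_20.0dB.csv"

fileregex <- paste0(
  "^exp_([0-9]+)BITS_([0-9]+(\\.[0-9]+)?)MHz_",
  "([-+]?[0-9]+(\\.[0-9]+)?)dB(_([-+]?[0-9]+(\\.[0-9]+))dB)?",
  "(_noise)?(_freqs)?\\.csv$"
)

# Element 1 is the full match, element k+1 is capture group k,
# i.e., the same indexing as in load_all()
pars <- regmatches(fname, regexec(fileregex, fname))[[1]]

m      <- 2^as.integer(pars[2])   # number of symbols
fc     <- as.numeric(pars[3])     # carrier frequency (MHz)
hwgain <- as.numeric(pars[5])     # hardware gain (dB)
# regexec() returns "" for an unmatched optional group
attenuation <- if (nzchar(pars[8])) -as.numeric(pars[8]) else 0
gain   <- hwgain - attenuation    # combined gain as used in the plots
```

For this made-up name, the sketch yields \(M = 8\), a hardware gain of 10 dB and a combined gain of 30 dB.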

sprintf("Loading clean data.")

## [1] "Loading clean data."

dat_clean = load_all(path_clean)
sprintf("The clean data set contains %d measurements from %d files",
        nrow(dat_clean), length(unique(dat_clean$file)))

## [1] "The clean data set contains 1294674 measurements from 130 files"

sprintf("Loading noise data.")

## [1] "Loading noise data."

dat_noise = load_all(path_noise)
dat_noise = dat_noise[level_signal != -Inf]
sprintf("The noise data set contains %d measurements from %d files",
        nrow(dat_noise), length(unique(dat_noise$file)))

## [1] "The noise data set contains 280366 measurements from 55 files"

sprintf("Loading frequency offset data.")

## [1] "Loading frequency offset data."

dat_freqo = load_all(path_freqo)
sprintf("The frequency offset data set contains %d measurements from %d files",
        nrow(dat_freqo), length(unique(dat_freqo$file)))

## [1] "The frequency offset data set contains 2955177 measurements from 255 files"

Setup Validation

In order to verify the plausibility of our measurements, we start with looking at some raw measurement values.

Gain vs. Received Signal Strength

Let’s begin with the received signal strength vs. gain:

ggplot(dat_clean,
       aes(x = factor(gain, levels = 0:50),
           y = level_signal)) + geom_boxplot() +
  scale_x_discrete(drop=FALSE, breaks=seq(0, 40, 5)) +
  xlab("Gain (dB)") + ylab("Received Signal Strength (dBm)")

This plot shows several things:

The minimum trigger level of the GRX1090 is at about -100dBm (median)
The transmit gain can be mapped to power in dBm by subtracting a constant offset of 100dB
Clipping occurs at about -55dBm

Next, we will look at the offset between the received signal strength (RSS) and the transmit gain in more detail.

setups = data.table(
  starts = c(2, 23, 42),
  stops = c(23, 42, 51),
  id = c("90dB", "70dB", "60dB")
)

ggplot(dat_clean) +
  geom_rect(data = setups,
            aes(fill=id,
                xmin=starts, xmax=stops,
                ymin=-106, ymax=-97), alpha=0.1) +
  geom_boxplot(aes(x = factor(gain, levels = 0:50),
                   y = level_signal-gain)) +
  scale_x_discrete(drop=FALSE, breaks=seq(0, 50, 5)) +
  labs(fill="Attenuation") +
  xlab("Gain (dB)") + ylab("RSS-Tx Gain Offset (dB)")

And the distribution for unclipped values:

ggplot(dat_clean[gain < 40], # no clipping
       aes(x = level_signal-gain)) +
  geom_histogram(binwidth=0.1) +
  xlab("RSS-Tx Gain Offset (dB)") + ylab("Number of Measurements")

Interestingly, the offset seems to have two different centers at about -99.5 and -97.5 dB. While there is certainly some small non-linear inaccuracy coming from the transmitter’s hardware gain, the two centers of the distribution likely result from inaccuracies in the attenuators that we used. We removed or replaced some of them between the measurements, leading to the different distributions.

ggplot(dat_clean[gain <= 50], # full gain range, including clipped values
       aes(x = gain, y = level_signal-gain, color = as.factor(m))) +
  geom_point(alpha=0.2) +
  xlab("Tx Gain (dB)") + ylab("RSS-Tx Gain Offset (dB)") +
  labs(color = "M")

Now this plot provides some insights! First, RSS values only occur in discrete steps and the step width depends on the gain. With higher signal strengths, the step width becomes smaller. Moreover, the distribution of these discrete values depends on \(M\). One explanation for these effects could be a limitation resulting from the resolution of the receiver’s ADC (12 bits). This theory is also strengthened by the fact that the step widths become larger for very small values.
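
To illustrate why a limited ADC resolution would produce gain-dependent step widths, here is a minimal sketch under a simplified model: it assumes that the reported level is \(20 \cdot \log_{10}\) of an integer amplitude \(k\). How the GRX1090 actually derives its level values is not known here, so this only shows the general effect.

```r
# Integer amplitudes of a signed 12-bit ADC range from 1 to 2047
k <- c(1, 2, 4, 16, 256, 2047)

# Difference in dB between two adjacent quantization levels k and k + 1
step_db <- 20 * log10((k + 1) / k)

data.frame(amplitude = k, step_db = round(step_db, 3))
```

In this model, the step between the two smallest amplitudes is about 6 dB, while it shrinks to a few thousandths of a dB close to full scale.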

Noise

Let’s have a look at the distribution of the noise level of each received signal:

ggplot(dat_clean, aes(x = level_noise)) + geom_histogram(binwidth = 0.5) +
  xlab("Noise Level (dBm)") + ylab("Number of Measurements")

Seems like there are some measurements with an elevated noise level. Let’s see if we find a pattern:

# plot noise level by gain and M
ggplot(dat_clean, aes(x = as.factor(gain), y = level_noise)) +
  geom_boxplot(aes(color = as.factor(m))) +
  xlab("Gain (dB)") + ylab("Noise Level (dBm)") +
  theme(axis.text.x = element_text(angle = 90)) +
  labs(color="M")

It seems that the noise floor estimation includes a small part of the signal’s energy when the signal becomes too strong. The reason could simply be that, due to the bandwidth-limited rise and fall times of the pulses, the pulses become wider the stronger they are.

Phase Overlay Performance

Noise-free Channel

Alright, now that we know our setup a little better and trust that the measured values make sense, it’s time to look at the performance of the phase overlay.

It’s pretty clear that the bit error rate depends on the signal to noise ratio. So let’s have a look at this relationship:

ggplot(dat_clean, aes(x = as.factor(gain-99 - -102), y = psk_payload_errors/psk_size)) +
  geom_boxplot(aes(color=as.factor(m))) +
  xlab("SNR (dB)") + ylab("Bit Error Rate")

Ok, maybe a more concise plot:

ber_stats = dat_clean[, .(count = .N,
                          mean = mean(psk_payload_errors),
                          p95 = quantile(psk_payload_errors, 0.95),
                          smean = mean(psk_symbol_errors), 
                          sp95 = quantile(psk_symbol_errors, 0.95)),
                      by=.(gain, m)]
ber_stats$bits = 112 * log2(ber_stats$m)
ber_stats$rss = ber_stats$gain - 99
ber_stats$snr = ber_stats$rss - -102

ggplot(ber_stats, aes(x = rss, color = as.factor(m))) +
  geom_line(aes(y = p95/bits * 100, linetype="95-Percentile")) +
  geom_line(aes(y = mean/bits * 100, linetype="Mean")) +
  scale_linetype_manual(values=c("dotted", "solid")) +
  xlab("Received Signal Strength (dBm)") + ylab("Bit Errors (%)") +
  labs(color = "M", linetype="")

Let’s have a look at the total number of bits that are successfully transmitted for each \(M\):

ggplot(ber_stats, aes(x = rss, color = as.factor(m))) +
  geom_line(aes(y = bits - p95, linetype="95-Percentile")) +
  geom_line(aes(y = bits - mean, linetype="Mean")) +
  geom_line(aes(y = bits, linetype="Limit"), alpha=0.4) +
  scale_linetype_manual(values=c("dotted", "solid", "dashed")) +
  coord_cartesian(ylim = c(0, max(ber_stats$bits))) +
  xlab("Received Signal Strength (dBm)") + ylab("Correct Bits per Frame") +
  labs(color = "M", linetype="")

The last plot nicely illustrates that the theoretical added capacity is \(\log_2(M) \cdot 112\) bits and that the higher \(M\), the smaller the signal strength range where the maximum capacity is available. The question now is, what does that mean in a real-world scenario? If we assume that a transponder transmits its signals at around 56dBm and the signal strength decreases with distance according to the standard free-space path loss model (FSPL), we can translate the received signal strength to distances. For a distance \(d\) in meters and a carrier frequency of \(f = 1090\) MHz, the FSPL in dB can be approximated as follows:

\[\text{FSPL} = 20 \cdot \log_{10}\left(\frac{4 \pi d f}{c}\right) = 20 \cdot \log_{10}(d) + 20 \cdot \log_{10} \left(\frac{4 \pi f}{c}\right) = 20 \cdot \log_{10}(d) + 33.2~\text{dB}\]

If we combine the FSPL with the transmit power of 56dBm, we can map the received signal strength to a distance as follows:

\[d = 10^\frac{56~\text{dBm} - 33.2~\text{dB} - \text{RSS}}{20}\]
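
As a quick sanity check of the constant and the resulting distances, the following sketch recomputes the frequency-dependent FSPL term for 1090 MHz and converts a few RSS values to nautical miles. The helper name `rss_to_nm` and the example RSS values are made up for illustration.

```r
f  <- 1090e6      # carrier frequency in Hz
c0 <- 299792458   # speed of light in m/s

# Frequency-dependent part of the FSPL for d in meters (about 33.2 dB)
fspl_const <- 20 * log10(4 * pi * f / c0)

# Map a received signal strength (dBm) to a distance in nautical miles,
# assuming free-space propagation and a transmit power of 56 dBm
rss_to_nm <- function(rss_dbm, tx_dbm = 56) {
  d_m <- 10^((tx_dbm - fspl_const - rss_dbm) / 20)
  d_m / 1852
}

rss_to_nm(c(-70, -80, -90))
```

Since the path loss grows with \(20 \cdot \log_{10}(d)\), every 20 dB of additional loss corresponds to a tenfold distance.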

ber_stats$dist = 10^((56-33.2-ber_stats$rss)/20)

ggplot(ber_stats, aes(x = dist/1852, color = as.factor(m))) +
  geom_line(aes(y = bits - p95, linetype="95-Percentile")) +
  geom_line(aes(y = bits - mean, linetype="Mean")) +
  geom_line(aes(y = bits, linetype="Limit"), alpha=0.4) +
  scale_linetype_manual(values=c("dotted", "solid", "dashed")) +
  coord_cartesian(ylim = c(0, max(ber_stats$bits))) +
  xlab("Distance (NM)") + ylab("Correct Bits per Frame") +
  labs(color = "M", linetype="")

Let’s finally have a look at the symbol errors:

ggplot(ber_stats, aes(x = rss, color = as.factor(m))) +
  geom_line(aes(y = 100*sp95/112, linetype="95-Percentile")) +
  geom_line(aes(y = 100*smean/112, linetype="Mean")) +
  scale_linetype_manual(values=c("dashed", "solid")) +
  xlab("Signal Strength (dBm)") + ylab("Symbol Errors (%)") +
  labs(color = "M", linetype="")

Note that unless there is a systematic error, the worst-case symbol error rate is \(1 - 1/M\), i.e., the expected rate if symbols were randomly guessed. So if symbol errors are i.i.d. (should be roughly the case if they are caused by noise), the error percentage for \(M=2\) is around 50% (equal to flipping a coin), while for \(M=8\) the maximum percentage is at 87.5%.
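
The random-guessing bound can be checked with a small simulation. This is only a sketch with simulated symbols; it does not use any of the measurement data.

```r
set.seed(1)
m <- c(2, 4, 8)

# Theoretical error rate if every symbol is guessed uniformly at random
ser_max <- 1 - 1 / m

# Simulated error rate: compare random "sent" and "guessed" symbols
n <- 1e5
ser_sim <- sapply(m, function(mi) {
  sent    <- sample.int(mi, n, replace = TRUE)
  guessed <- sample.int(mi, n, replace = TRUE)
  mean(sent != guessed)
})

data.frame(M = m, theory_percent = 100 * ser_max, simulated_percent = 100 * ser_sim)
```

The theoretical values are 50%, 75% and 87.5% for \(M = 2, 4, 8\), and the simulated rates should land close to them.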

Noisy Channel

First of all, let’s verify once more that the setup works as expected. We do that by checking whether the received signal strength is changing more or less linearly with the transmit gain.

ggplot(dat_noise,
       aes(x = factor(gain), y = level_signal-gain)) + geom_boxplot() +
  scale_x_discrete(drop=FALSE) +
  xlab("Tx Gain (dB)") + ylab("RSS-Tx Gain Offset (dB)")

Looks good. While there is a slight drift (probably due to inaccuracy of the transmit gain), the difference between transmit gain and received signal strength remains almost constant around -100.7dB during all measurements.

We will now look into the bit error rates in a noisy communication channel. The noise was generated by replaying a continuous recording from Frankfurt Airport while varying the transmit power of the reference frame with phase overlay.

ber_stats_noise = dat_noise[, .(count = .N,
                                mean = mean(psk_payload_errors),
                                p95 = quantile(psk_payload_errors, 0.95),
                                smean = mean(psk_symbol_errors), 
                                sp95 = quantile(psk_symbol_errors, 0.95)),
                            by=.(gain, m)]
ber_stats_noise$bits = 112 * log2(ber_stats_noise$m)
ber_stats_noise$rss = ber_stats_noise$gain - 101 # 3dB for the splitter
ber_stats_noise$snr = ber_stats_noise$rss - -102
ber_stats_noise$dist = 10^((56-33.2-ber_stats_noise$rss)/20)

ggplot(ber_stats_noise, aes(x = rss, color = as.factor(m))) +
  geom_line(aes(y = p95/bits * 100, linetype="95-Percentile")) +
  geom_line(aes(y = mean/bits * 100, linetype="Mean")) +
  scale_linetype_manual(values=c("dotted", "solid")) +
  xlab("Signal Strength (dBm)") + ylab("Bit Errors (%)") +
  labs(color = "M", linetype="")

Comparison Noise vs. No Noise:

ggplot(ber_stats_noise, aes(x = rss, color = as.factor(m))) +
  geom_line(aes(y = mean/bits * 100, linetype="Noisy")) +
  geom_line(data = ber_stats, aes(y = mean/bits * 100, linetype="Noise-free")) +
  xlab("Signal Strength (dBm)") + ylab("Average Bit Errors (%)") +
  scale_linetype_manual(values=c("dotted", "solid")) +
  labs(color = "M", linetype="") +
  coord_cartesian(xlim = c(min(ber_stats$rss), max(ber_stats_noise$rss)))

And if we look at it again in terms of distances between transponder and receiver:

ber_stats_noise$dist = 10^((56-33.2-ber_stats_noise$rss)/20)

ggplot(ber_stats_noise, aes(x = dist/1852, color = as.factor(m))) +
  geom_line(aes(y = mean/bits * 100, linetype="Noisy")) +
  geom_line(data = ber_stats, aes(y = mean/bits * 100, linetype="Noise-free")) +
  xlab("Distance (NM)") + ylab("Avg. Bit Errors (%)") +
  scale_linetype_manual(values=c("dotted", "solid")) +
  labs(color = "M", linetype="") +
  coord_cartesian(xlim = c(min(ber_stats$dist), max(ber_stats_noise$dist))/1852)

And finally in terms of channel capacity:

ggplot(ber_stats_noise, aes(x = dist/1852, color = as.factor(m))) +
  geom_line(aes(y = bits - p95, linetype="95-Percentile")) +
  geom_line(aes(y = bits - mean, linetype="Mean")) +
  scale_linetype_manual(values=c("dotted", "solid")) +
  coord_cartesian(ylim = c(0, max(ber_stats$bits)), xlim=c(0, max(ber_stats_noise$dist/1852))) +
  xlab("Distance (NM)") + ylab("Correct Bits per Frame") +
  ggtitle("Assumed Tx Power of 56dBm") +
  labs(color = "M", linetype="")

To finalize this analysis, we now add the approximate overhead for forward error correction (FEC) to the calculation. We assume that Reed-Solomon codes are used for FEC, whose overhead is approximately twice the number of bits that the code should be able to correct. In our example, we want to be able to correct 95% of the received frames at a distance of about 250NM. Hence, we’ll assume an FEC overhead of twice the 95-percentile of erroneous bits for the different \(M\):

rs_overhead = ber_stats_noise[gain==10, .(m, p95, bits)]
rs_overhead$overhead = rs_overhead$p95*2
rs_overhead$net_capacity = rs_overhead[, bits-overhead]

ggplot(rs_overhead, aes(x=as.factor(m), y=net_capacity)) +
  geom_bar(stat = "identity", width = 0.5) +
  geom_text(aes(label = net_capacity), vjust = -0.2) +
  xlab("M") + ylab("Net Capacity (Bits per Transmission)")
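
The overhead rule used here can be written as a small helper. This is a simplified sketch: Reed-Solomon codes operate on multi-bit symbols, so counting the overhead in bits is only a rough approximation. The function name and the error counts in the example are hypothetical and not measured values.

```r
# Net capacity per frame if 2 * t parity bits are reserved
# in order to correct up to t erroneous bits
net_capacity <- function(m, t_errors, symbols = 112) {
  bits <- symbols * log2(m)   # gross phase overlay capacity
  bits - 2 * t_errors
}

# Hypothetical 95-percentile error counts for M = 2, 4, 8
net_capacity(m = c(2, 4, 8), t_errors = c(0, 10, 50))
```

With these made-up error counts, the net capacities would be 112, 204 and 236 bits. A larger \(M\) only pays off as long as its additional bit errors cost less than the additional gross capacity.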

We conclude that in a noisy environment, \(M=8\) provides the highest phase overlay capacity.

Carrier Frequency Offsets

# round before converting: as.integer() truncates, so 49.999... would become 49
ggplot(dat_freqo, aes(x = factor(as.integer(round(fo*1000))), y = level_signal)) +
  geom_boxplot() + scale_x_discrete(drop=FALSE, breaks=seq(-250, 250, 50)) +
  xlab("Frequency Offset (kHz)") + ylab("RSS (dBm)")

ber_stats_freqo = dat_freqo[, .(count = .N,
                                mean = mean(psk_payload_errors),
                                p95 = quantile(psk_payload_errors, 0.95),
                                smean = mean(psk_symbol_errors), 
                                sp95 = quantile(psk_symbol_errors, 0.95)),
                            by=.(fo, m)]

ber_stats_freqo$bits = 112 * log2(ber_stats_freqo$m)

ggplot(ber_stats_freqo, aes(x = fo*1000, color = as.factor(m))) +
  geom_line(aes(y = mean/bits * 100, linetype="Mean")) +
  geom_line(aes(y = p95/bits * 100, linetype="95-Percentile")) +
  xlab("Carrier Frequency Offset (kHz)") + ylab("Bit Errors (%)") +
  scale_linetype_manual(values=c("dotted", "solid")) +
  labs(color = "M", linetype="")

Conclusions

We conclude that a phase overlay design with \(M=8\) performs best in a realistic radio environment. It can provide an additional capacity of about 218 bits while maintaining a good robustness against carrier frequency offsets of up to 40kHz.