
Waldo Spectrogram Dataset for Signal Detection and Localization in the Citizen Broadband Radio Service (CBRS) Band


Dataset Download:
Please use the following link to download the dataset, which contains ~1300 spectrograms of the CBRS band generated in a simulated MATLAB environment:

Download: Spectrograms of the CBRS band containing 5G, LTE, DSSS, and Radar signals

This dataset was created for the task of signal detection and localization in the CBRS band, and is used in "Finding Waldo in the CBRS Band: Signal Detection and Localization in the 3.5 GHz Spectrum," published in IEEE GLOBECOM 2022. Any publication with a bibliography section that results from use of this dataset should cite our paper. The PDF and the reference for the paper are given here:

Paper PDF
Nasim Soltani, Vini Chaudhary, Debashri Roy, and Kaushik Chowdhury, "Finding Waldo in the CBRS Band: Signal Detection and Localization in the 3.5 GHz Spectrum," IEEE GLOBECOM, December 2022, pp. 4570-4575.



Code: For the experiments in the paper, this dataset is used with a publicly available Keras-TensorFlow implementation of YOLOv3. After the annotation files are converted, another publicly available repository containing a PyTorch implementation of YOLOv3 can also be used for signal detection. Instructions for training and testing with both frameworks are available in their corresponding README files.
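As an illustration of the annotation conversion mentioned above, the following is a minimal sketch, not the conversion script used for the paper. It assumes the .xml files follow the Pascal VOC layout (a `size` element with `width` and `height`, and `object` elements with `name` and `bndbox`), and that the target is the plain-text format common to many PyTorch YOLOv3 implementations (class index followed by normalized box center, width, and height). The class names and their order in `CLASSES` are hypothetical and should be replaced with the label strings that actually appear in the dataset.

```python
import xml.etree.ElementTree as ET

# Hypothetical label strings and order; check the dataset's .xml files for the real ones.
CLASSES = ["LTE", "5G", "DSSS", "Radar"]


def voc_to_yolo_lines(xml_text, classes=CLASSES):
    """Convert one Pascal VOC-style annotation (assumed layout) to YOLO text lines."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    img_w = float(size.findtext("width"))
    img_h = float(size.findtext("height"))

    lines = []
    for obj in root.iter("object"):
        class_id = classes.index(obj.findtext("name"))
        box = obj.find("bndbox")
        xmin = float(box.findtext("xmin"))
        ymin = float(box.findtext("ymin"))
        xmax = float(box.findtext("xmax"))
        ymax = float(box.findtext("ymax"))

        # YOLO text format: box center and size, normalized by the image dimensions.
        x_center = (xmin + xmax) / 2 / img_w
        y_center = (ymin + ymax) / 2 / img_h
        width = (xmax - xmin) / img_w
        height = (ymax - ymin) / img_h
        lines.append(
            f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"
        )
    return lines


if __name__ == "__main__":
    # Made-up annotation for illustration only; not taken from the dataset.
    sample = (
        "<annotation><size><width>1000</width><height>500</height></size>"
        "<object><name>Radar</name><bndbox><xmin>100</xmin><ymin>50</ymin>"
        "<xmax>300</xmax><ymax>150</ymax></bndbox></object></annotation>"
    )
    print("\n".join(voc_to_yolo_lines(sample)))
```

Writing the returned lines to a .txt file named after each image gives one label file per spectrogram. The README of the chosen PyTorch repository remains the authority on the exact format it expects.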

Dataset Description:

This dataset contains ~1300 spectrogram images of the CBRS band generated in a simulated MATLAB environment. Each image file is accompanied by a corresponding annotation file in the standard .xml format that contains signal labels and their pixel bounding boxes. To create the dataset, the simulated CBRS band is monitored and recorded for 20 ms at a 100 MHz sampling rate, and a spectrogram is created from each observation. Each spectrogram may contain one or both of the 5G and LTE signals, which do not overlap but may be in close proximity. Each spectrogram may or may not also contain a DSSS signal or a Radar signal overlapping with either one of the cellular transmissions. Our dataset is fully compliant with the Federal Communications Commission (FCC) regulations for the CBRS band, and has the following properties:

  • Power values: As mandated by FCC regulations, the cumulative power of Additive White Gaussian Noise (AWGN) and other sources of interference must be no more than -109 dBm/MHz around the ESC sensor. We define SINR as 10 × log10(P_radar/P_noise), where P_radar is the peak radar power per MHz, and P_noise is the average noise and interference power per MHz of the radar band. In this case, radar pulses with at least -89 dBm/MHz of peak power (at least 20 dB SINR) must be detected with 99% accuracy. We vary the power of radar pulses in the range of -89 to -79 dBm/MHz to emulate SINRs in the range of 20 to 30 dB. For the noise and interference power, we go up to 5 dB beyond the FCC limit, and allow the noise and interference power to vary in the range of -109 to -104 dBm/MHz, to create a challenging dataset. In the scenes where radar overlaps with the PAL user signal (i.e., LTE or 5G), we allocate 25% of the power to AWGN and 75% to the LTE/5G signal, to ensure the SINR is between 15 and 30 dB.

  • Signal bandwidth: According to FCC regulations, each CBRS vendor can purchase and aggregate up to four 10 MHz channels in the CBRS band. Therefore, the largest bandwidth that PAL users can transmit over is 40 MHz. Based on this, we randomize the bandwidth of 5G and LTE signals over all the standard bandwidths in the range of 5 to 40 MHz.

  • Radar parameters: We use standard-compliant radar type-1, which is used in naval radars. We use the tool released by the National Institute of Standards and Technology (NIST), and set the pulse width of radar pulses to 0.5 µs. We set the pulses-per-burst parameter to 20 and the pulse repetition rate to 1010 pulses per second.

  • Sampling rate and duration: The 10 MHz channels in the CBRS band can be monitored separately for radar pulses with a sampling rate of 10 MHz, or the whole CBRS band can be monitored at once with a higher sampling rate. As the PAL users can exist in the first 100 MHz of the CBRS band between 3.55 GHz and 3.65 GHz, we monitor the entire 100 MHz at once and sample the wireless channel at a 100 MHz sampling rate. Consequently, the spectrograms that we create from the sampled data span 100 MHz on the horizontal axis. We sample the CBRS band for 20 ms, so the time axis of the spectrogram spans 20 ms. We assume that all four signals (radar, DSSS, 5G, and LTE) that might appear in each scene span this entire 20 ms.

  • LTE and 5G spectrum usage technique: As per FCC regulations in the CBRS band, we create LTE and 5G signals in the Time Division Duplex (TDD) mode.
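The power and timing figures in the list above can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not part of the dataset generation code. The function names are hypothetical, and it assumes that the 25%/75% split applies to the total noise-and-interference power in linear units.

```python
import math


def sinr_db(p_radar_dbm_per_mhz, p_noise_dbm_per_mhz):
    """SINR = 10*log10(P_radar / P_noise); with both powers in dBm/MHz this is a difference."""
    return p_radar_dbm_per_mhz - p_noise_dbm_per_mhz


def share_of_power_dbm(total_dbm_per_mhz, fraction):
    """Power of a linear fraction of a total (assumed reading of the 25%/75% split)."""
    return total_dbm_per_mhz + 10 * math.log10(fraction)


def burst_duration_ms(pulses_per_burst, pulses_per_second):
    """Time taken by one radar burst at the given pulse repetition rate."""
    return 1000.0 * pulses_per_burst / pulses_per_second


def num_samples(sample_rate_hz, duration_s):
    """Number of samples in one recorded observation."""
    return round(sample_rate_hz * duration_s)


if __name__ == "__main__":
    # Weakest and strongest radar over the FCC noise limit: the 20 to 30 dB range.
    print(sinr_db(-89, -109), sinr_db(-79, -109))
    # Weakest radar over the highest noise level used: consistent with the 15 dB lower bound.
    print(sinr_db(-89, -104))
    # Splitting a -104 dBm/MHz budget into 25% AWGN and 75% LTE/5G.
    print(share_of_power_dbm(-104, 0.25), share_of_power_dbm(-104, 0.75))
    # One burst of 20 pulses at 1010 pulses per second, compared with the 20 ms window.
    print(burst_duration_ms(20, 1010))
    # 20 ms sampled at 100 MHz.
    print(num_samples(100e6, 20e-3))
```

Under these assumptions, a burst of 20 pulses lasts about 19.8 ms, which fits inside the 20 ms observation, and each observation holds 2,000,000 samples.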
A detailed description of the dataset is given in Section III of the paper. A sample spectrogram from the dataset is shown below in Figure 1.

Figure 1. An example spectrogram in the dataset with all four signal types: LTE, 5G, DSSS, and Radar.
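Because each spectrogram spans 100 MHz on the horizontal axis and 20 ms in time, a pixel bounding box from an annotation file can be translated into a frequency range and a time interval. The following is a rough sketch under several assumptions that the dataset description does not confirm: the image covers exactly 3.55 to 3.65 GHz and 0 to 20 ms with no margins, both axes are linear, frequency increases from left to right, and time increases from top to bottom. The function name and the returned keys are hypothetical.

```python
F_START_HZ = 3.55e9   # lower edge of the monitored band
SPAN_HZ = 100e6       # horizontal span of each spectrogram
DURATION_S = 20e-3    # time span of each spectrogram


def bbox_to_freq_time(xmin, ymin, xmax, ymax, img_w, img_h):
    """Map a pixel bounding box to frequency and time (axis layout is an assumption)."""
    hz_per_px = SPAN_HZ / img_w
    s_per_px = DURATION_S / img_h
    return {
        "f_low_hz": F_START_HZ + xmin * hz_per_px,
        "f_high_hz": F_START_HZ + xmax * hz_per_px,
        "bandwidth_hz": (xmax - xmin) * hz_per_px,
        "t_start_s": ymin * s_per_px,
        "t_stop_s": ymax * s_per_px,
    }


if __name__ == "__main__":
    # Made-up box on a made-up 1000 x 500 pixel image, for illustration only.
    box = bbox_to_freq_time(100, 0, 300, 500, img_w=1000, img_h=500)
    print(f"{box['bandwidth_hz'] / 1e6:.1f} MHz wide, "
          f"starting at {box['f_low_hz'] / 1e9:.3f} GHz")
```

If the actual images have axis margins or a flipped axis, the offsets and signs in this mapping need to be adjusted accordingly.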