
Datasets of Synthetically Generated and Captured IQ Samples and Cyclostationary Signal Processing (CSP) Features for Detecting Anomalous RF Underlay Signals


Download our datasets:
Please use the links below to download the datasets:
Generated IQ samples and CSP features of MATLAB generated LTE and DSSS signals
Captured IQ samples and CSP features of over-the-air transmission of LTE and DSSS signals from srsLTE base stations
Captured IQ samples and CSP features of over-the-air transmission of commercial LTE and DSSS signals captured in the wild [Upcoming]
These datasets were used for the paper "ICARUS: Learning on IQ and Cycle Frequencies for Detecting Anomalous RF Underlay Signals" to be published in IEEE INFOCOM 2023. Please use this link to download the paper. Any use of this dataset which results in an academic publication or other publication which includes a bibliography should include a citation to our paper. Here is the reference for the work:

Conference version: PDF
D. Roy, V. Chaudhury, C. Tassie, C. Spooner, and K. R. Chowdhury, "ICARUS: Learning on IQ and Cycle Frequencies for Detecting Anomalous RF Underlay Signals," IEEE INFOCOM 2023, New York Area, USA, May 2023.


Description:
ICARUS presents a machine-learning-based framework that offers choices at the physical layer for inference with inputs of (i) in-phase and quadrature (IQ) samples only, (ii) cycle-frequency features obtained via cyclostationary signal processing (CSP), and (iii) fusion of both, to detect the underlay DSSS signal and its modulation type within LTE frames. ICARUS chooses the best inference method considering both the expected accuracy and the computational overhead. ICARUS is rigorously validated on multiple real-world datasets that include signals captured in cellular bands in the wild and on the NSF POWDER testbed of the Platforms for Advanced Wireless Research (PAWR) program. Results reveal that ICARUS can detect DSSS anomalies and their modulation scheme with 98-100% and 67-99% accuracy, respectively, while completing inference within 3-40 milliseconds on an NVIDIA A100 GPU platform.

Architecture

Fig. 1: Steps in detecting an anomaly and recognizing its modulation type within the baseline signal by selecting one of the four color-coded pipelines composed of neural networks (NNs) and/or signal processors: (1) signal processing on CSP features, (2) NN on IQ only, (3) NN on CSP features only, (4) NN fusing both IQ and CSP features. One of these four inference pipelines must be chosen considering computational resource constraints.
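The trade-off described in the caption above can be sketched as a small selection routine. This is only an illustration of one plausible rule (pick the most accurate pipeline that fits a latency budget), not necessarily the rule used in the paper. The Pipeline class, the choose_pipeline function, and every accuracy and latency value below are hypothetical placeholders, not measured results.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Pipeline:
    name: str
    expected_accuracy: float  # fraction in [0, 1]; placeholder value
    latency_ms: float         # inference time in ms; placeholder value


def choose_pipeline(pipelines: Sequence[Pipeline],
                    latency_budget_ms: float) -> Optional[Pipeline]:
    """Return the most accurate pipeline that fits the budget, or None."""
    feasible = [p for p in pipelines if p.latency_ms <= latency_budget_ms]
    if not feasible:
        return None
    # Prefer higher expected accuracy; break ties with lower latency.
    return max(feasible, key=lambda p: (p.expected_accuracy, -p.latency_ms))


# Placeholder profile: these numbers are invented for illustration only
# and do not come from the paper or the datasets.
CANDIDATES = [
    Pipeline("signal processing on CSP features", 0.6, 1.0),
    Pipeline("NN on IQ only", 0.7, 2.0),
    Pipeline("NN on CSP features only", 0.8, 3.0),
    Pipeline("NN fusing IQ and CSP features", 0.9, 4.0),
]

if __name__ == "__main__":
    chosen = choose_pipeline(CANDIDATES, latency_budget_ms=2.5)
    print(chosen.name if chosen else "no pipeline fits the budget")
```

In practice the accuracy and latency of each pipeline would be profiled on the target hardware before a rule like this is applied.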

Training

Fig. 2: The ICARUS training framework. Different components are detailed in the main paper.

Testing

Fig. 3: The ICARUS inference framework. Here the input signal is either LTE or LTE with underlay DSSS signals.


Data Collection Setup
We generate and collect (a) baseline LTE and (b) LTE with underlay DSSS signal waveforms in three different dataset modules: (i) Synthetic Dataset: We generate a dataset of synthetic LTE and DSSS waveforms using MATLAB R2021b and combine them in an emulated channel environment. (ii) Indoor OTA-PAWR Dataset with srsLTE Waveforms: We collect this dataset from the NSF POWDER testbed of the PAWR platform using the srslte-otalab profile, which provides resources for performing over-the-air (OTA) operation in their indoor lab. This collection emulates a private LTE network. We use one USRP X310 operating at a 3.56 GHz center frequency as an srsLTE eNodeB for transmitting LTE FDD downlink signals. We use two USRP B210s operating at a 3.56 GHz center frequency, one as an srsLTE UE and another for collecting the IQ samples of the downlink srsLTE signal. (iii) OTA-Cellular Dataset with Commercial Cellular Waveforms: We capture ambient downlink OTA LTE signals present in the cellular PCS bands, after which we add either synthetic DSSS (BPSK/QPSK) or OTA-captured DSSS BPSK signals. On this webpage we release the LTE signals with only synthetic DSSS as underlay. We use a USRP X310 for collecting signals at sampling rates of {7.68, 15.36, 30.72} MHz. Details of these datasets are provided in the paper.
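As a quick sanity check on the capture parameters, the time span covered by one CSP processing block follows from duration = block length / sampling rate. The sketch below pairs the OTA-Cellular sampling rates stated above with the three block lengths listed for that dataset under Data Description. The function name block_duration_ms is illustrative, and the assumption is that each sampling rate is given in complex samples per second.

```python
# Sampling rates of the OTA-Cellular captures (MHz) and the block lengths
# (in complex samples) listed for that dataset under Data Description.
SAMPLING_RATES_MHZ = (7.68, 15.36, 30.72)
BLOCK_LENGTHS = (131072, 262144, 524288)


def block_duration_ms(block_length: int, fs_mhz: float) -> float:
    """Duration in milliseconds of block_length samples at fs_mhz MHz."""
    return block_length / (fs_mhz * 1e6) * 1e3


if __name__ == "__main__":
    for fs in SAMPLING_RATES_MHZ:
        for n in BLOCK_LENGTHS:
            print(f"fs = {fs:5.2f} MHz, N = {n:6d}: "
                  f"{block_duration_ms(n, fs):8.3f} ms")
```

For instance, a 131072-sample block at 15.36 MHz spans roughly 8.5 ms, so each block covers a bit less than one 10 ms LTE radio frame.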

Data Description
Each dataset folder contains three main subfolders: (i) IQ, (ii) CSP_Features, (iii) Metadata.

(i) IQ: This folder contains the IQ samples of the OTA transmission or MATLAB generated waveforms in binary format.

(ii) CSP_Features: This folder contains the extracted conjugate (C) and non-conjugate (NC) CSP features, computed over different block lengths for the different datasets. The block length of each file is indicated either in the filename itself or by the subfolder it resides in. The POWDER dataset has CSP features for five block lengths: (i) 32768, (ii) 65536, (iii) 131072, (iv) 262144, and (v) 524288. The other two datasets contain CSP features for three block lengths only: (i) 131072, (ii) 262144, and (iii) 524288.

(iii) Metadata: This folder contains metadata about the synthetic MATLAB dataset and the POWDER dataset in CSV format. For the commercial LTE dataset, there are two integers before the .tim extension of the IQ filename. These are a 'loop index' and a 'signal index'. When the loop index is odd, the data file corresponds to LTE with an underlay DSSS signal; when it is even, the file contains baseline LTE. For example, the lte_10MHz_4_1_fs_15.36MHz_12_16.tim file contains a baseline LTE signal. The rest of the metadata is in the dsss_signal_params.txt file.
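The filename convention for the commercial LTE dataset can be turned into labels with a few lines of Python. This is a minimal sketch under two assumptions: the two integers appear in the order loop index, then signal index (the order in which they are named above), and they are separated by underscores directly before the .tim extension, as in the example filename. The names CellularLabel and label_from_filename are illustrative and not part of the dataset.

```python
import re
from typing import NamedTuple


class CellularLabel(NamedTuple):
    loop_index: int
    signal_index: int
    has_dsss: bool  # True: LTE with underlay DSSS, False: baseline LTE


# Two underscore-separated integers immediately before the .tim extension.
# Assumed order: loop index first, signal index second.
_TAIL = re.compile(r"_(\d+)_(\d+)\.tim$")


def label_from_filename(filename: str) -> CellularLabel:
    """Derive the class label of an OTA-Cellular IQ file from its name."""
    match = _TAIL.search(filename)
    if match is None:
        raise ValueError(f"unexpected filename: {filename!r}")
    loop_index = int(match.group(1))
    signal_index = int(match.group(2))
    # Odd loop index: underlay DSSS present; even loop index: baseline LTE.
    return CellularLabel(loop_index, signal_index, loop_index % 2 == 1)


if __name__ == "__main__":
    print(label_from_filename("lte_10MHz_4_1_fs_15.36MHz_12_16.tim"))
```

For the example file above, the loop index is 12 (even), so the file is labeled as baseline LTE, in agreement with the description.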

Number of Samples: Synthetic: 210 (LTE) + 420 (LTE + DSSS); OTA-POWDER: 500 (LTE) + 2000 (LTE + DSSS); OTA-Cellular: 2430 (LTE) + 2430 (LTE + DSSS)
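The class balance differs considerably across the three datasets, which matters when interpreting detection accuracy. The short sketch below computes the totals and the fraction of LTE + DSSS examples from the counts above, assuming each count refers to one labeled example; the helper name anomaly_fraction is illustrative.

```python
# Sample counts as listed above, per dataset and per class.
SAMPLE_COUNTS = {
    "Synthetic": {"LTE": 210, "LTE + DSSS": 420},
    "OTA-POWDER": {"LTE": 500, "LTE + DSSS": 2000},
    "OTA-Cellular": {"LTE": 2430, "LTE + DSSS": 2430},
}


def anomaly_fraction(counts: dict) -> float:
    """Fraction of examples that contain an underlay DSSS signal."""
    return counts["LTE + DSSS"] / sum(counts.values())


if __name__ == "__main__":
    for name, counts in SAMPLE_COUNTS.items():
        total = sum(counts.values())
        print(f"{name}: {total} samples, "
              f"{anomaly_fraction(counts):.1%} with underlay DSSS")
```

Because 80% of the OTA-POWDER examples contain DSSS, a detector that always predicts "anomaly" would already reach 80% accuracy there, so per-class metrics or a balanced split may be worth considering alongside overall accuracy.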

Exp-Setup

Fig. 4: The overview of the collected datasets for evaluating ICARUS.