NUScon 2018 Rules

This document serves to convey the intentions and spirit of the first annual NUScon competition. The organizers have prepared this document as guidance for participants, but reserve the right to change these specifications as they deem necessary in order to maintain the integrity of the competition and best serve the NMR community.

These rules and associated scoring metrics are also provided as PDFs: Rules | Scoring Metrics
Scope

Intent
The intent of NUScon 2018 is to pose NUS challenge questions and solicit solutions from the NMR community in order to identify best practices for sampling and spectral reconstructions for a select set of 3D experiments.

Approach
A public set of uniformly sampled (US) data and sample schedules defines the challenge problems. Contestants submit processing scripts that operate on nonuniformly sampled (NUS) data, which is obtained by subsampling the US data with the provided sample schedules. Submissions are scored against a set of private synthetic peak lists and private sample schedules. The private synthetic peaks, which match the characteristics of the public US data, are injected into the US data. The result is then subsampled with private sample schedules that share the characteristics of the public schedules but differ in random seed. The ability of each submitted script to recover the synthetic peaks is scored by metrics of sensitivity, resolution, frequency accuracy, and intensity accuracy. The best-performing entries are awarded cash prizes from a pool of $25,000, made available through a very generous donation from David and Miriam Donoho.
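
The subsampling step described above can be sketched in Python. This is only an illustration, not the organizers' implementation: the function name `subsample`, the dictionary layout of the data, and the zero-based (t1, t2) index pairs for the two indirect dimensions are all assumptions made for the example.

```python
def subsample(us_data, schedule):
    """Keep only the indirect-dimension increments listed in the schedule.

    us_data:  dict mapping (t1, t2) index pairs to a list of direct-dimension
              points (hypothetical layout, chosen for clarity)
    schedule: iterable of (t1, t2) index pairs to retain
    """
    missing = [pt for pt in schedule if pt not in us_data]
    if missing:
        raise KeyError(f"schedule points outside the uniform grid: {missing}")
    return {pt: us_data[pt] for pt in schedule}


# Toy 4 x 4 uniform grid with 2 direct-dimension points per increment.
us_data = {(i, j): [complex(i, j), complex(j, i)] for i in range(4) for j in range(4)}
schedule = [(0, 0), (1, 2), (3, 1), (2, 3)]

nus_data = subsample(us_data, schedule)
coverage = len(nus_data) / len(us_data)
print(f"kept {len(nus_data)} of {len(us_data)} increments ({coverage:.0%} coverage)")
```

Here the schedule retains 4 of the 16 increments, so the NUS data set has 25% coverage of the uniform grid.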

Deliverables
A manuscript that presents a summary of the competition results, including a discussion of the winning methodologies, will be prepared. The following resources employed by the competition will be made available through NMRbox:

Eligibility
All NMRbox users, including NUScon organizers and their labs, are eligible to submit entries and to win prizes. The NUScon organizers responsible for creating the private synthetic peak lists will be ineligible to win prizes. Individuals or teams may submit entries. If submitting a team entry, do so from the NMRbox account of one member and include a list of all team members, as instructed in the Submission section.

Overview
This is a brief outline of the contest. The details are addressed in the subsequent sections:
Data

Challenge Problems
The challenge problems are listed on the NUScon Challenges Page.
The problems include triple resonance experiments (HNCA and HNCACB) and NOESY experiments (15N-NOESY and 13C-NOESY). The proteins studied in each experiment are denoted "A", "B", etc.

NMRbox Resources
Resources for the contestants are located at the following locations on the NMRbox platform:

/NUScon
  /challenges
    /challenges_<#>.tar.gz    zipped archive of the challenge (good to download)
    /challenges_<#>    unpacked challenge (good to browse)
      /<protein>-<experiment>    sub-folder for specific experiment
        /sample_schedules    example sample schedules
        /US_data    uniformly sampled data
          fid.com    convert time data to nmrPipe format (produces /fid)
          nmr_ft.com    process time data (produces /ft and /ftproj)
          /fid
          /ft
          /ftproj
        /nuscon2018_generic.com    template for contestants to build their processing script
  /submissions    directory for contestant submissions
  /utilities

Challenge problems may be browsed in the /NUScon folder. To work on a challenge problem, a contestant should extract the corresponding compressed archive file into their own account as follows:

    tar -xf /NUScon/challenges/challenge_1.tar.gz


Explanation of tar command:
-x extract files from an archive
-f use an archive file
/NUScon/challenges/challenge_1.tar.gz location of file

To reduce file size, the tar files do not contain the /fid and /ft directories noted above. These directories may be reproduced in your copy of the challenge data by running fid.com and nmr_ft.com, respectively.

Sample Schedules
For challenge problems that require the use of a provided sample schedule, the following schemes are used:

Each sampling scheme will be used to provide sample schedules at 3 levels of coverage, which will be determined by the experiment type and characteristics of the data. These sample schedules are used for developing a submission. Additional sample schedules that follow the same parameters but vary the random seed value will be generated for scoring the submissions. These are generated at the time of scoring and are kept hidden from the contestants.
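
The relationship between coverage level and random seed can be sketched as follows. Plain uniform random selection is used here only as a stand-in and is not one of the contest's sampling schemes; the function name `random_schedule`, the grid size, and the seed values are hypothetical.

```python
import random


def random_schedule(n1, n2, coverage, seed):
    """Draw a reproducible random subset of an n1 x n2 indirect-dimension grid.

    coverage is the fraction of grid points to keep; the same seed always
    yields the same schedule, and a different seed yields a different draw
    with the same number of points.
    """
    if not 0 < coverage <= 1:
        raise ValueError("coverage must be in (0, 1]")
    grid = [(i, j) for i in range(n1) for j in range(n2)]
    n_keep = max(1, round(coverage * len(grid)))
    rng = random.Random(seed)  # private generator, independent of global state
    return sorted(rng.sample(grid, n_keep))


# A development schedule and a scoring schedule that share every parameter
# except the seed (illustrative values only).
public_schedule = random_schedule(8, 8, coverage=0.25, seed=1)
private_schedule = random_schedule(8, 8, coverage=0.25, seed=2)
print(len(public_schedule), len(private_schedule))
```

Both schedules contain the same number of points, which is why a script tuned on the public schedules should carry over to the hidden ones if it does not depend on the specific points sampled.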

Processing Scripts for Uniform Data
The NUScon organizers will provide a script for processing each uniform experimental data set. This script includes several parameters that the contestant may use for guidance in writing their submission script. Some parameters (as noted below) may not be adjusted.
Submission

Building your submission
A template script for your submission is provided for each challenge problem. The challenges are posted on the NUScon Challenges Page, and each one gives the location of its template script within NMRbox.

Requirements for submission script filename

Requirements for submission script inputs and outputs
The following requirements are intended to standardize and streamline the scoring procedure.

Requirements for submission script documentation

Ethics
Contestants must abide by the NUScon ethics statement:
Any script that is not "reasonable" or tries to cheat the challenge problem is disqualified.

This includes, but is not limited to, the following:
Scoring

Synthetic Peak Lists
There will be multiple synthetic peak lists created to probe the spectral properties identified by the metrics. The synthetic peak lists are generated by a subgroup of the NUScon organizers and will be kept private until the NUScon competition closes. The principles that guide the construction of the synthetic peak lists are as follows:
Number: Peak lists contain relatively few peaks, so as not to significantly alter the spectral density of the data set.
Location: Peaks are injected in empty regions of the uniform data set.
Characteristics: Synthetic peaks are constructed with amplitudes, linewidth, shape, phase, and distortions similar to those observed in the corresponding uniform data set.
NOESYs: For 3D NOESYs, the cross-peaks are accompanied by the corresponding diagonal signal, whose amplitude is typical for the diagonals in the uniform spectrum. The diagonal peaks and cross-peaks have amplitudes, linewidth, shape, phase, and distortions similar to those observed in the corresponding uniform data set. The diagonal peaks are not used for the scoring metrics and thus may overlap with other signals in the spectrum.
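
As a rough illustration of what injecting a synthetic peak means, the sketch below adds a single exponentially decaying complex sinusoid to one-dimensional time-domain data. It assumes a Lorentzian line shape and a single dimension for brevity; the actual synthetic peaks are three-dimensional and may include other shapes and distortions. The function names and all numerical values are hypothetical.

```python
import cmath
import math


def synthetic_fid(n_points, dwell, freq_hz, linewidth_hz, amplitude, phase_deg=0.0):
    """Exponentially decaying complex sinusoid (a Lorentzian line after FT)."""
    decay = math.pi * linewidth_hz  # decay rate for a Lorentzian of this FWHM
    phase = math.radians(phase_deg)
    return [
        amplitude
        * cmath.exp(1j * (2 * math.pi * freq_hz * k * dwell + phase) - decay * k * dwell)
        for k in range(n_points)
    ]


def inject(fid, peak):
    """Add a synthetic signal to existing time-domain data, point by point."""
    if len(fid) != len(peak):
        raise ValueError("data and synthetic signal must have the same length")
    return [a + b for a, b in zip(fid, peak)]


# Empty background stands in for an empty region of the uniform data.
background = [0j] * 64
peak = synthetic_fid(64, dwell=1e-3, freq_hz=100.0, linewidth_hz=5.0, amplitude=2.0)
mixed = inject(background, peak)
```

Because the addition is done in the time domain, the injected signal passes through subsampling and reconstruction exactly as the experimental signals do.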

Peak Picking
A single "standard" peak picking script using nmrPipe will be used on all spectra. The peak picking script enlarges the local neighborhood size according to the number of zero fills. This ensures that the identification of peaks is not biased by variations in digital resolution. The peak picking script will be posted in NMRbox at: /NUScon/utilities
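
The idea of scaling the neighborhood with zero filling can be illustrated with a toy one-dimensional peak picker. This is not the nmrPipe script: it assumes that each zero fill doubles the number of points and therefore doubles the neighborhood half-width in points, and the function name `pick_peaks` and the example data are hypothetical.

```python
def pick_peaks(spectrum, base_half_width, zero_fills, threshold):
    """Return indices that are the unique maximum of their local neighborhood.

    The half-width (in points) is doubled for every zero fill so that the
    neighborhood spans the same frequency range at any digital resolution.
    """
    half_width = base_half_width * 2 ** zero_fills
    peaks = []
    for i, value in enumerate(spectrum):
        if value < threshold:
            continue
        lo = max(0, i - half_width)
        hi = min(len(spectrum), i + half_width + 1)
        window = spectrum[lo:hi]
        if value == max(window) and window.count(value) == 1:
            peaks.append(i)
    return peaks


# The same two peaks at two digital resolutions; "fine" is a hand-made
# stand-in for the interpolation produced by one zero fill.
coarse = [0, 5, 0, 3, 0]
fine = [0, 2, 5, 2, 0, 1, 3, 1, 0, 0]

coarse_peaks = pick_peaks(coarse, base_half_width=1, zero_fills=0, threshold=1.5)
fine_peaks = pick_peaks(fine, base_half_width=1, zero_fills=1, threshold=1.5)
print(coarse_peaks, fine_peaks)
```

With the scaled neighborhood, both spectra yield two peaks at corresponding positions, and the interpolated shoulder points in the finer spectrum are not picked as extra peaks.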

Metrics and Scoring
The following categories are scored:
These metrics are explicitly defined in the accompanying Scoring Metrics.

Scoring Categories
The metrics defined above are used in conjunction to determine scores for the following categories:

Awards
For each challenge problem, funds are allocated for awards in Fidelity, Detection, and Best Overall. The number of awards given in each category will depend on the number of submissions and on how clearly the best submissions separate themselves in quality from the others. The NUScon organizing committee reserves all rights to adjust this format, and its decision is final.