Two datasets: (a) a text-to-sound aligned corpus of spoken data totalling 275 hours and 3 million words from over 500 speakers in 146 locations across Scotland; (b) over 100,000 acceptability judgments across approximately 200 data points.