Each word in SUC has been annotated with information about part-of-speech, morphological features and citation form. SUC is a balanced corpus, which means that it consists of texts from a wide variety of genres in carefully selected proportions. The texts in SUC were written in the 1990s. SUC has been released in three versions: SUC 1.0 (1997), SUC 2.0 (2006) and SUC 3.0 (2012).
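To make the three annotation layers concrete, here is a minimal sketch of how one annotated token could be represented and read in Python. The tab-separated line layout, the `Token` class and the `parse_token` helper are hypothetical and chosen only for illustration; they are not the file format in which SUC is distributed, and the tag values in the example are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    word: str     # surface form as it appears in the text
    pos: str      # part-of-speech tag
    morph: tuple  # morphological features, possibly empty
    lemma: str    # citation form


def parse_token(line: str) -> Token:
    """Parse one line of an ASSUMED layout: word, POS, '|'-joined features, lemma,
    separated by tabs. This layout is hypothetical, not SUC's own format."""
    word, pos, morph, lemma = line.rstrip("\n").split("\t")
    features = tuple(morph.split("|")) if morph else ()
    return Token(word, pos, features, lemma)


# Illustrative example: a definite singular noun and its citation form.
token = parse_token("bilen\tNN\tUTR|SIN|DEF|NOM\tbil")
```

Keeping the morphological features as a tuple rather than a single string makes it easy to test for an individual feature, for example `"DEF" in token.morph`.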

The most recent release, SUC 3.0, contains thousands of changes towards a better and more consistent annotation. Additionally, the full text of more works covered by the SUC license has been added, supplementing the bonus materials distributed in SUC 2.0. The Stockholm Internet Corpus (SIC) consists of blog texts using the annotation scheme in SUC, and for convenience this material is also distributed together with SUC 3.0.

Licensing of SUC has been delegated to Språkbanken at the University of Gothenburg, but Mats Wirén at Stockholm University is authorized to provide copies of SUC to local teaching staff at the Department of Linguistics upon receiving a signed copy of the SUC license. The license agreement is common to all versions of the corpus, so if you have already signed it for a previous version, you can simply contact Språkbanken to obtain the most recent version of SUC.

Språkbanken has an online concordancer, Korp, which includes SUC (versions 2.0 and 3.0). This concordancer can be used by anyone, without the need to sign the SUC license.