Dataset of depressive posts in Russian language collected from social media

Narynov S., Mukhtarkhanuly D., Omarov B.

DATA IN BRIEF, vol.29, 2020 (Peer-Reviewed Journal)

  • Publication Type: Article
  • Volume: 29
  • Publication Date: 2020
  • DOI Number: 10.1016/j.dib.2020.105195
  • Journal Name: DATA IN BRIEF
  • Journal Indexes: Emerging Sources Citation Index, Scopus, BIOSIS, Directory of Open Access Journals


This paper presents a dataset collected from social networks that are mostly used by young people in Commonwealth of Independent States (CIS) countries. The data were collected from public accounts on the VKontakte social network using VK.api, by searching for the keywords most commonly used to signify a depressive mood. Psychologists classified the collected posts into two types: depressive and non-depressive. The dataset consists of 32 018 depressive posts and 32 021 non-depressive posts. Because Russian is the most widely spoken language in CIS countries, the posts, and therefore the dataset, are in Russian. The data are likely to be most useful to researchers who study tendencies toward depression in CIS countries. The dataset is important for the research community because it was not only collected from open sources but also labeled by our psychiatrists from the Republican Scientific and Practical Center of Mental Health. Since the dataset has very high validity, it can be used for further research in the field of mental health. (c) 2020 The Author(s). Published by Elsevier Inc. This is an open access article under the CC BY license (