• Erna Nurmawati, Politeknik Statistika STIS
• Adielia Amanda, Politeknik Statistika STIS
Keywords: BPS data, IndoBERT, LDA, Sentiment Analysis, Topic Modeling


Each year, the Central Bureau of Statistics (Badan Pusat Statistik, BPS) conducts the Data Needs Survey (Survei Kebutuhan Data, SKD) to identify data requirements and to measure consumer satisfaction with the quality of the data BPS produces. However, SKD respondents are limited to consumers who received services from the Integrated Statistics Services (Pelayanan Statistik Terpadu, PST) unit at BPS in a given year. Capturing the opinions of the wider public, who access BPS data through channels other than the PST unit, requires an alternative source; this study uses social media, specifically Twitter.

This study uses Twitter data to analyze public sentiment toward BPS data, and applies topic modeling to understand the distribution of topics the public discusses about BPS data indicators. Sentiment analysis is performed with IndoBERT, a Bidirectional Encoder Representations from Transformers (BERT) model for the Indonesian language. Topic modeling is performed with Latent Dirichlet Allocation (LDA).
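As a rough illustration of the LDA step only, the sketch below fits a topic model with collapsed Gibbs sampling using nothing but the Python standard library. It is not the authors' implementation: this abstract does not specify the library, inference method, or hyperparameters used, so the function names (`lda_gibbs`, `top_words`), the values of `alpha`, `beta`, and `iterations`, and the toy tokenized tweets are all assumptions made for demonstration.

```python
import random
from collections import Counter


def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling on pre-tokenized documents.

    Returns (doc_topic, topic_word, vocab), where doc_topic[d][k] counts the
    tokens of document d assigned to topic k and topic_word[k][w] counts how
    often word w is assigned to topic k.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    vocab_size = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [Counter() for _ in range(num_topics)]
    topic_total = [0] * num_topics
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from all counts.
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Conditional weight of each topic for this token:
                # (n_dk + alpha) * (n_kw + beta) / (n_k + V * beta)
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                # Resample the topic and restore the counts.
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    return doc_topic, topic_word, vocab


def top_words(topic_word, n=3):
    """Return the n most frequently assigned words for each topic."""
    return [[w for w, _ in counts.most_common(n)] for counts in topic_word]


# Hypothetical, already-preprocessed toy tweets (not data from the study).
docs = [
    ["inflasi", "harga", "naik", "inflasi"],
    ["harga", "beras", "naik", "inflasi"],
    ["sensus", "penduduk", "jumlah", "penduduk"],
    ["penduduk", "miskin", "jumlah", "sensus"],
]

doc_topic, topic_word, vocab = lda_gibbs(docs, num_topics=2)
for k, words in enumerate(top_words(topic_word)):
    print(f"topic {k}: {', '.join(words)}")
```

On a real corpus, a maintained library implementation would normally be preferable to this hand-written sampler, and the number of topics would be chosen with a model-selection criterion such as a coherence score rather than fixed in advance.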

The sentiment analysis for 2020 to 2022 shows that tweets related to BPS data are generally neutral in sentiment. Topic modeling yields a range of topics that vary from year to year. Across 2020 to 2022, the most frequently discussed topics are consistent with the data requirements section of the Data Needs Survey for the same years, reflecting the diversity of data needs.


How to Cite
Nurmawati, E., & Amanda, A. (2023). ANALISIS SENTIMEN DAN PEMODELAN TOPIK PADA TWEET TERKAIT DATA BADAN PUSAT STATISTIK. Jurnal Sistem Informasi Dan Informatika (Simika), 6(2), 165-176.